Compare commits

..

47 Commits

Author SHA1 Message Date
ad117f4592 Update recommended Terraform provider versions
* Recommend provider plugin version tested against
2019-11-13 13:53:46 -08:00
42b6df89c8 Update Prometheus from v2.14.0-rc.0 to v2.14.0
* https://github.com/prometheus/prometheus/releases/tag/v2.14.0
2019-11-13 13:41:11 -08:00
d7061020ba Update Kubernetes from v1.16.2 to v1.16.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md#v1163
2019-11-13 13:05:15 -08:00
a8b7792338 Update Grafana from v6.4.3 to v6.4.4
* https://github.com/grafana/grafana/releases/tag/v6.4.4
2019-11-07 12:00:25 -08:00
a3807086d4 Update Prometheus from v2.13.1 to v2.14.0-rc.0
* Happy PromCon 2019!
* https://github.com/prometheus/prometheus/releases/tag/v2.14.0-rc.0
2019-11-07 11:48:23 -08:00
2c163503f1 Update etcd from v3.4.2 to v3.4.3
* etcd v3.4.3 builds with Go v1.12.12 instead of v1.12.9
and adds a few minor metrics fixes
* https://github.com/etcd-io/etcd/compare/v3.4.2...v3.4.3
2019-11-07 11:41:01 -08:00
0034a15711 Update Calico from v3.10.0 to v3.10.1
* https://docs.projectcalico.org/v3.10/release-notes/
2019-11-07 11:38:32 -08:00
38957163cb Output resource_group_id in Azure (#577)
* Add an output variable `resource_group_id` to the azure module
2019-10-31 01:05:04 -07:00
d4573092b5 Improve Kubelet and Compute Resource dashboards
* Add cluster filter to Kubelet dashboard
* Add network details in resource dashboards
* https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/275
* https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/284
* https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/285
2019-10-28 02:22:15 -07:00
4775e9d0f7 Upgrade Calico v3.9.2 to v3.10.0
* Allow advertising Kubernetes service ClusterIPs to BGPPeer
routers via a BGPConfiguration
* Improve EdgeRouter docs about routes and BGP
* https://docs.projectcalico.org/v3.10/release-notes/
* https://docs.projectcalico.org/v3.10/networking/advertise-service-ips
2019-10-27 14:13:41 -07:00
d418045929 Switch kube-proxy from iptables mode to ipvs mode
* Kubernetes v1.11 considered kube-proxy IPVS mode GA
* Many problems were found #321
* Since then, major blockers seem to have been addressed
2019-10-27 00:37:41 -07:00
eb7b6d39f2 Improve minor aspects of CoreDNS and nginx-ingress dashboards
* Add default 10s refresh rate to custom dashboards to match
those from Kubernetes
* Show labels for "instance" as "pod" for clarity
* Add cluster filter for internal use
2019-10-20 23:16:55 -07:00
33d4c2fd68 Add explicit annotation for Prometheus port to scrape
* Without the prometheus.io/port annotation, Prometheus
service discovery can scrape other Prometheus ports that
may be available.
* For example, Prometheus sidecars (not included) may
be scraped and that may be unintended
2019-10-20 16:05:09 -07:00
de90cb9246 Remove kube-state-metrics addon-resizer
* addon-resizer is outdated and has been dropped from
kube-state-metrics examples. Those using it should look
to the cluster-proportional-vertical-autoscaler.
* Eliminate addon-resizer log spew
* Remove associated Role and RoleBinding
* Also fix kube-state-metrics readinessProbe port
2019-10-20 16:03:29 -07:00
68da420adc Refresh Prometheus rules/alerts and Grafana dashboards
* Update Prometheus rules/alerts and Grafana dashboards
* Remove dashboards that were moved to node-exporter, they
may be added back later if valuable
* Remove kube-prometheus based rules/alerts (ClockSkew alert)
2019-10-19 17:43:47 -07:00
130c97f8eb Update Prometheus from v2.13.0 to v2.13.1
* https://github.com/prometheus/prometheus/releases/tag/v2.13.1
2019-10-18 00:10:25 -07:00
271d2f6b52 Update Grafana from v6.4.2 to v6.4.3
* https://github.com/grafana/grafana/releases/tag/v6.4.3
2019-10-18 00:08:39 -07:00
0595915a19 Cleanup CHANGES notes 2019-10-15 23:25:45 -07:00
e6bc5143aa Default to Calico as the CNI provider on Azure/DigitalOcean
* Change `networking` default from flannel to calico on
Azure and DigitalOcean
* AWS, bare-metal, and Google Cloud continue to default
to Calico (as they have since v1.7.5)
* Typhoon now defaults to using Calico and supporting
NetworkPolicy on all platforms
2019-10-15 23:15:40 -07:00
e4ac1027c8 Update Grafana from v6.4.1 to v6.4.2
* https://github.com/grafana/grafana/releases/tag/v6.4.2
2019-10-15 22:58:43 -07:00
24fc440d83 Update Kubernetes from v1.16.1 to v1.16.2
* Update Calico from v3.9.1 to v3.9.2
2019-10-15 22:42:52 -07:00
a6702573a2 Update etcd from v3.4.1 to v3.4.2
* https://github.com/etcd-io/etcd/releases/tag/v3.4.2
2019-10-15 00:06:15 -07:00
69188af565 Rename CLUO label from "app" to "name"
* Match the labeling pattern in other addons
2019-10-15 00:05:02 -07:00
d874bdd17d Update bootstrap module control plane manifests and type constraints
* Remove unneeded control plane flags that correspond to defaults
* Adopt Terraform v0.12 type constraints in bootstrap module
2019-10-06 21:09:30 -07:00
5b9dab6659 Introduce list of detail objects for bare-metal machines
* Define bare-metal `controllers` and `workers` as a complex type
list(object{name=string, mac=string, domain=string}) to allow
clusters with many machines to be defined more cleanly
* Remove `controller_names` list variable
* Remove `controller_macs` list variable
* Remove `controller_domains` list variable
* Remove `worker_names` list variable
* Remove `worker_macs` list variable
* Remove `worker_domains` list variable
2019-10-06 20:22:45 -07:00
5196709fe0 Update docs, CHANGES, and mkdocs-material
* Update mkdocs-material from v4.4.2 to v4.4.3
* Update recommended Terraform provider versions
* Cleanup the changelog before release
2019-10-06 18:41:25 -07:00
ab72f1ab2d Update Prometheus from v2.12.0 to v2.13.0
* https://github.com/prometheus/prometheus/releases/tag/v2.13.0
2019-10-06 18:22:20 -07:00
5ef4155e08 Detect most recent Fedora CoreOS AMI in region
* Detect the most recent Fedora CoreOS AMI to allow usage
of Fedora CoreOS in supported regions (previously just
us-east-1)
* Unpin the Fedora CoreOS AMI image which was pinned to
images that had been checked. This does mean if Fedora
publishes a broken image, it will be selected
* Filter out "dev" images which have similar naming
2019-10-06 18:13:55 -07:00
15c4b793c3 Use new Fedora CoreOS kernel/initrd/raw asset names
* Fedora CoreOS changed the kernel, initramfs, and raw
image asset download paths and names in 30.20191002.0
2019-10-06 17:31:21 -07:00
36ed53924f Add stricter types for bare-metal modules
* Review variables available in bare-metal kubernetes modules
for Container Linux and Fedora CoreOS
* Deprecate cluster_domain_suffix variable
* Remove deprecated container_linux_oem variable
2019-10-06 17:18:50 -07:00
19de38b30d Fix Prometheus etcd metrics scraping
* Prometheus was configured to use kubernetes discovery
of etcd targets based on nodes matching the node label
node-role.kubernetes.io/controller=true
* Kubernetes v1.16 stopped permitting node role labels
node-role.kubernetes.io/* so Typhoon renamed these labels
(no longer any association with roles) to
node.kubermetes.io/controller=true
* As a result, Prometheus didn't discover etcd targets,
etcd metrics were missing, etcd alerts were ineffective,
and the etcd Grafana dashboard was empty
* Introduced: https://github.com/poseidon/typhoon/pull/543
2019-10-03 19:07:05 -07:00
995824fa6d Add stricter types for DigitalOcean module
* Review variables available in DigitalOcean kubernetes
module and sync with documentation
* Promote Calico for DigitalOcean and Azure beyond experimental
(its the primary mode I've used since it was introduced)
2019-10-02 21:48:24 -07:00
1c5ed84fc2 Update Kubernetes from v1.16.0 to v1.16.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md#v1161
2019-10-02 21:31:55 -07:00
ca7d62720e Update Grafana from v6.3.6 to v6.4.1
* https://github.com/grafana/grafana/releases/tag/v6.4.1
2019-10-02 20:36:05 -07:00
26f8d76755 Update kube-state-metrics from v1.7.2 to v1.8.0
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.8.0
2019-10-01 20:50:33 -07:00
fdd6882a87 Add stricter types to Azure modules
* Review variables available in Azure kubernetes and workers
modules and sync with documentation
* Fix internal workers module default type to Standard_DS1_v2
2019-09-30 22:20:20 -07:00
f82266ac8c Add stricter types for GCP modules
* Review variables available in google-cloud kubernetes
and workers modules and in documentation
2019-09-30 22:04:35 -07:00
7bcf2d7831 Update nginx-ingress from v0.25.1 to v0.26.1
* Add lifecycle hook to allow draining connections for
up to 5 minutes
2019-09-30 22:01:07 -07:00
78bfff0afe Update Fedora CoreOS to testing 30.20190905.0
* Fix duplicated cluster_domain_suffix variable
2019-09-29 11:34:31 -07:00
a6de245d8a Rename bootkube.tf to bootstrap.tf
* Typhoon no longer uses the bootkube project
2019-09-29 11:30:49 -07:00
96afa6a531 Update Calico from v3.8.2 to v3.9.1
* https://docs.projectcalico.org/v3.9/release-notes/
2019-09-29 11:22:53 -07:00
a407ff72df Add stricter types for AWS modules and update docs
* Review variables available in AWS kubernetes and workers
modules and documentation
* Switching between spot and on-demand has worked since
Terraform v0.12
* Generally, there are too many knobs. Less useful ones
should be de-emphasized or removed
* Remove `cluster_domain_suffix` documentation
2019-09-29 11:19:38 -07:00
f453c54956 Update Grafana from v6.3.5 to v6.3.6
* https://github.com/grafana/grafana/releases/tag/v6.3.6
2019-09-28 15:13:46 -07:00
3e34fb075b Update etcd from v3.4.0 to v3.4.1
* https://github.com/etcd-io/etcd/releases/tag/v3.4.1
2019-09-28 15:09:57 -07:00
9bfb1c5faf Update docs and variable types for worker node_labels
* Document worker pools `node_labels` variable to set the
initial node labels for a homogeneous set of workers
* Document `worker_node_labels` convenience variable to
set the initial node labels for default worker nodes
2019-09-28 15:05:12 -07:00
99ab81f79c Add node_labels variable in workers modules to set initial node labels (#550)
* Also add `worker_node_labels` variable in `kubernetes` modules to set
initial node labels for the default workers
2019-09-28 14:59:24 -07:00
8703f2c3c5 Fix missing comma separator on bare-metal and DO
* Introduced in bare-metal and DigitalOcean in #544
while addressing possible ordering race, but after
the v1.16 upgrade validation
2019-09-23 11:05:26 -07:00
97 changed files with 13481 additions and 4630 deletions

View File

@ -4,6 +4,94 @@ Notable changes between versions.
## Latest ## Latest
## v1.16.3
* Kubernetes [v1.16.3](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md#v1163)
* Update etcd from v3.4.2 to v3.4.3 ([#582](https://github.com/poseidon/typhoon/pull/582))
* Upgrade Calico from v3.9.2 to [v3.10.1](https://docs.projectcalico.org/v3.10/release-notes/)
* Allow advertising service ClusterIPs to peer routers via a [BGPConfiguration](https://docs.projectcalico.org/v3.10/networking/advertise-service-ips)
* Switch `kube-proxy` from iptables to ipvs mode ([#574](https://github.com/poseidon/typhoon/pull/574))
#### Addons
* Update Prometheus from v2.13.0 to [v2.14.0](https://github.com/prometheus/prometheus/releases/tag/v2.14.0)
* Refresh rules, alerts, and dashboards from upstreams
* Remove addon-resizer from kube-state-metrics ([#575](https://github.com/poseidon/typhoon/pull/575))
* Update Grafana from v6.4.2 to v6.4.4
## v1.16.2
* Kubernetes [v1.16.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md#v1162)
* Update etcd from v3.4.1 to v3.4.2 ([#570](https://github.com/poseidon/typhoon/pull/570))
* Update Calico from v3.9.1 to [v3.9.2](https://docs.projectcalico.org/v3.9/release-notes/)
* Default to using Calico and supporting NetworkPolicy on all platforms
#### Azure
* Change default networking provider from "flannel" to "calico" ([#573](https://github.com/poseidon/typhoon/pull/573))
#### Bare-Metal
* Add `controllers` and `workers` as typed lists of machine detail objects ([#566](https://github.com/poseidon/typhoon/pull/566))
* Define clusters' machines cleanly and with Terraform v0.12 type constraints (**action required**, see PR example)
* Remove `controller_names`, `controller_macs`, and `controller_domains` variables
* Remove `worker_names`, `worker_macs`, and `worker_domains` variables
#### DigitalOcean
* Change default networking provider from "flannel" to "calico" ([#573](https://github.com/poseidon/typhoon/pull/573))
#### Addons
* Update Grafana from v6.4.1 to [v6.4.2](https://github.com/grafana/grafana/releases/tag/v6.4.2)
* Change CLUO label from "app" to "name"
## v1.16.1
* Kubernetes [v1.16.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md#v1161)
* Update etcd from v3.4.0 to [v3.4.1](https://github.com/etcd-io/etcd/releases/tag/v3.4.1)
* Update Calico from v3.8.2 to [v3.9.1](https://docs.projectcalico.org/v3.9/release-notes/)
* Add Terraform v0.12 variables types ([#553](https://github.com/poseidon/typhoon/pull/553), [#557](https://github.com/poseidon/typhoon/pull/557), [#560](https://github.com/poseidon/typhoon/pull/560), [#556](https://github.com/poseidon/typhoon/pull/556), [#562](https://github.com/poseidon/typhoon/pull/562))
* Deprecate `cluster_domain_suffix` variable
#### AWS
* Add `worker_node_labels` variable to set initial worker node labels ([#550](https://github.com/poseidon/typhoon/pull/550))
* Add `node_labels` variable to internal `workers` pool module ([#550](https://github.com/poseidon/typhoon/pull/550))
* For Fedora CoreOS, detect most recent AMI in the region
#### Azure
* Promote `networking` provider Calico VXLAN out of experimental (set `networking = "calico"`)
* Add `worker_node_labels` variable to set initial worker node labels ([#550](https://github.com/poseidon/typhoon/pull/550))
* Add `node_labels` variable to internal `workers` pool module ([#550](https://github.com/poseidon/typhoon/pull/550))
* Change `workers` module default `vm_type` to `Standard_DS1_v2` (followup to [#539](https://github.com/poseidon/typhoon/pull/539))
#### Bare-Metal
* For Fedora CoreOS, use new kernel, initrd, and raw paths ([#563](https://github.com/poseidon/typhoon/pull/563))
* Fix Terraform missing comma error ([#549](https://github.com/poseidon/typhoon/pull/549))
* Remove deprecated `container_linux_oem` variable ([#562](https://github.com/poseidon/typhoon/pull/562))
#### DigitalOcean
* Promote `networking` provider Calico VXLAN out of experimental (set `networking = "calico"`)
* Fix Terraform missing comma error ([#549](https://github.com/poseidon/typhoon/pull/549))
#### Google Cloud
* Add `worker_node_labels` variable to set initial worker node labels ([#550](https://github.com/poseidon/typhoon/pull/550))
* Add `node_labels` variable to internal `workers` module ([#550](https://github.com/poseidon/typhoon/pull/550))
#### Addons
* Update Prometheus from v2.12.0 to [v2.13.0](https://github.com/prometheus/prometheus/releases/tag/v2.13.0)
* Fix Prometheus etcd target discovery and scraping ([#561](https://github.com/poseidon/typhoon/pull/561), regressed with Kubernetes v1.16.0)
* Update kube-state-metrics from v1.7.2 to v1.8.0
* Update nginx-ingress from v0.25.1 to [v0.26.1](https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.26.1) ([#555](https://github.com/poseidon/typhoon/pull/555))
* Add lifecycle hook to allow draining for up to 5 minutes
* Update Grafana from v6.3.5 to [v6.4.1](https://github.com/grafana/grafana/releases/tag/v6.4.1)
## v1.16.0 ## v1.16.0
* Kubernetes [v1.16.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md#v1160) ([#543](https://github.com/poseidon/typhoon/pull/543)) * Kubernetes [v1.16.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md#v1160) ([#543](https://github.com/poseidon/typhoon/pull/543))

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a> ## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.16.0 (upstream) * Kubernetes v1.16.3 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking * Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) * On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [preemptible](https://typhoon.psdn.io/cl/google-cloud/#preemption) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization * Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [preemptible](https://typhoon.psdn.io/cl/google-cloud/#preemption) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
@ -48,7 +48,7 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
```tf ```tf
module "google-cloud-yavin" { module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.16.0" source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.16.3"
# Google Cloud # Google Cloud
cluster_name = "yavin" cluster_name = "yavin"
@ -82,9 +82,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig $ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes $ kubectl get nodes
NAME ROLES STATUS AGE VERSION NAME ROLES STATUS AGE VERSION
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.16.0 yavin-controller-0.c.example-com.internal <none> Ready 6m v1.16.3
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.16.0 yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.16.3
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.16.0 yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.16.3
``` ```
List the pods. List the pods.

View File

@ -10,11 +10,11 @@ spec:
maxUnavailable: 1 maxUnavailable: 1
selector: selector:
matchLabels: matchLabels:
app: container-linux-update-agent name: container-linux-update-agent
template: template:
metadata: metadata:
labels: labels:
app: container-linux-update-agent name: container-linux-update-agent
annotations: annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default' seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec: spec:

View File

@ -7,11 +7,11 @@ spec:
replicas: 1 replicas: 1
selector: selector:
matchLabels: matchLabels:
app: container-linux-update-operator name: container-linux-update-operator
template: template:
metadata: metadata:
labels: labels:
app: container-linux-update-operator name: container-linux-update-operator
annotations: annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default' seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec: spec:

View File

@ -1,8 +1,4 @@
apiVersion: v1 apiVersion: v1
kind: ConfigMap
metadata:
name: grafana-dashboards-coredns
namespace: monitoring
data: data:
coredns.json: |- coredns.json: |-
{ {
@ -26,7 +22,7 @@ data:
"links": [ "links": [
], ],
"refresh": "", "refresh": "10s",
"rows": [ "rows": [
{ {
"collapse": false, "collapse": false,
@ -976,17 +972,43 @@ data:
"allValue": null, "allValue": null,
"current": { "current": {
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(kube_pod_info, cluster)",
"refresh": 2,
"regex": "",
"sort": 0,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
}, },
"datasource": "$datasource", "datasource": "$datasource",
"hide": 0, "hide": 0,
"includeAll": true, "includeAll": true,
"label": null, "label": "pod",
"multi": false, "multi": false,
"name": "instance", "name": "instance",
"options": [ "options": [
], ],
"query": "label_values(coredns_build_info{job=\"coredns\"}, instance)", "query": "label_values(coredns_build_info{cluster=\"$cluster\", job=\"coredns\"}, instance)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 0,
@ -1034,3 +1056,7 @@ data:
"uid": "2f3f749259235f58698ea949170d3bd5", "uid": "2f3f749259235f58698ea949170d3bd5",
"version": 0 "version": 0
} }
kind: ConfigMap
metadata:
name: grafana-dashboards-coredns
namespace: monitoring

View File

@ -1,8 +1,4 @@
apiVersion: v1 apiVersion: v1
kind: ConfigMap
metadata:
name: grafana-dashboards-etcd
namespace: monitoring
data: data:
etcd.json: |- etcd.json: |-
{ {
@ -1229,3 +1225,7 @@ data:
"uid": "c2f4e12cdf69feb95caa41a5a1b423d9", "uid": "c2f4e12cdf69feb95caa41a5a1b423d9",
"version": 215 "version": 215
} }
kind: ConfigMap
metadata:
name: grafana-dashboards-etcd
namespace: monitoring

View File

@ -1,8 +1,4 @@
apiVersion: v1 apiVersion: v1
kind: ConfigMap
metadata:
name: grafana-dashboards-k8s-nodes
namespace: monitoring
data: data:
kubelet.json: |- kubelet.json: |-
{ {
@ -92,7 +88,7 @@ data:
"tableColumn": "", "tableColumn": "",
"targets": [ "targets": [
{ {
"expr": "sum(up{job=\"kubelet\"})", "expr": "sum(up{cluster=\"$cluster\", job=\"kubelet\"})",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "", "legendFormat": "",
@ -176,7 +172,7 @@ data:
"tableColumn": "", "tableColumn": "",
"targets": [ "targets": [
{ {
"expr": "sum(kubelet_running_pod_count{job=\"kubelet\", instance=~\"$instance\"})", "expr": "sum(kubelet_running_pod_count{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"})",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}}", "legendFormat": "{{instance}}",
@ -260,7 +256,7 @@ data:
"tableColumn": "", "tableColumn": "",
"targets": [ "targets": [
{ {
"expr": "sum(kubelet_running_container_count{job=\"kubelet\", instance=~\"$instance\"})", "expr": "sum(kubelet_running_container_count{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"})",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}}", "legendFormat": "{{instance}}",
@ -344,7 +340,7 @@ data:
"tableColumn": "", "tableColumn": "",
"targets": [ "targets": [
{ {
"expr": "sum(volume_manager_total_volumes{job=\"kubelet\", instance=~\"$instance\", state=\"actual_state_of_world\"})", "expr": "sum(volume_manager_total_volumes{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\", state=\"actual_state_of_world\"})",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}}", "legendFormat": "{{instance}}",
@ -428,7 +424,7 @@ data:
"tableColumn": "", "tableColumn": "",
"targets": [ "targets": [
{ {
"expr": "sum(volume_manager_total_volumes{job=\"kubelet\", instance=~\"$instance\",state=\"desired_state_of_world\"})", "expr": "sum(volume_manager_total_volumes{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\",state=\"desired_state_of_world\"})",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}}", "legendFormat": "{{instance}}",
@ -512,7 +508,7 @@ data:
"tableColumn": "", "tableColumn": "",
"targets": [ "targets": [
{ {
"expr": "sum(rate(kubelet_node_config_error{job=\"kubelet\", instance=~\"$instance\"}[5m]))", "expr": "sum(rate(kubelet_node_config_error{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}[5m]))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}}", "legendFormat": "{{instance}}",
@ -592,7 +588,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(rate(kubelet_runtime_operations_total{job=\"kubelet\",instance=~\"$instance\"}[5m])) by (operation_type, instance)", "expr": "sum(rate(kubelet_runtime_operations_total{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (operation_type, instance)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}} {{operation_type}}", "legendFormat": "{{instance}} {{operation_type}}",
@ -683,7 +679,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(rate(kubelet_runtime_operations_errors_total{job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, operation_type)", "expr": "sum(rate(kubelet_runtime_operations_errors_total{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, operation_type)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}} {{operation_type}}", "legendFormat": "{{instance}} {{operation_type}}",
@ -787,7 +783,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "histogram_quantile(0.99, sum(rate(kubelet_runtime_operations_duration_seconds_bucket{job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, operation_type, le))", "expr": "histogram_quantile(0.99, sum(rate(kubelet_runtime_operations_duration_seconds_bucket{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, operation_type, le))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}} {{operation_type}}", "legendFormat": "{{instance}} {{operation_type}}",
@ -891,14 +887,14 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(rate(kubelet_pod_start_duration_seconds_count{job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance)", "expr": "sum(rate(kubelet_pod_start_duration_seconds_count{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}} pod", "legendFormat": "{{instance}} pod",
"refId": "A" "refId": "A"
}, },
{ {
"expr": "sum(rate(kubelet_pod_worker_duration_seconds_count{job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance)", "expr": "sum(rate(kubelet_pod_worker_duration_seconds_count{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}} worker", "legendFormat": "{{instance}} worker",
@ -989,14 +985,14 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "histogram_quantile(0.99, sum(rate(kubelet_pod_start_duration_seconds_count{job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, le))", "expr": "histogram_quantile(0.99, sum(rate(kubelet_pod_start_duration_seconds_count{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, le))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}} pod", "legendFormat": "{{instance}} pod",
"refId": "A" "refId": "A"
}, },
{ {
"expr": "histogram_quantile(0.99, sum(rate(kubelet_pod_worker_duration_seconds_bucket{job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, le))", "expr": "histogram_quantile(0.99, sum(rate(kubelet_pod_worker_duration_seconds_bucket{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, le))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}} worker", "legendFormat": "{{instance}} worker",
@ -1102,7 +1098,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(rate(storage_operation_duration_seconds_count{job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, operation_name, volume_plugin)", "expr": "sum(rate(storage_operation_duration_seconds_count{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, operation_name, volume_plugin)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}} {{operation_name}} {{volume_plugin}}", "legendFormat": "{{instance}} {{operation_name}} {{volume_plugin}}",
@ -1195,7 +1191,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(rate(storage_operation_errors_total{job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, operation_name, volume_plugin)", "expr": "sum(rate(storage_operation_errors_total{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, operation_name, volume_plugin)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}} {{operation_name}} {{volume_plugin}}", "legendFormat": "{{instance}} {{operation_name}} {{volume_plugin}}",
@ -1301,7 +1297,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "histogram_quantile(0.99, sum(rate(storage_operation_duration_seconds_bucket{job=\"kubelet\", instance=~\"$instance\"}[5m])) by (instance, operation_name, volume_plugin, le))", "expr": "histogram_quantile(0.99, sum(rate(storage_operation_duration_seconds_bucket{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}[5m])) by (instance, operation_name, volume_plugin, le))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}} {{operation_name}} {{volume_plugin}}", "legendFormat": "{{instance}} {{operation_name}} {{volume_plugin}}",
@ -1405,7 +1401,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(rate(kubelet_cgroup_manager_duration_seconds_count{job=\"kubelet\", instance=~\"$instance\"}[5m])) by (instance, operation_type)", "expr": "sum(rate(kubelet_cgroup_manager_duration_seconds_count{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}[5m])) by (instance, operation_type)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{operation_type}}", "legendFormat": "{{operation_type}}",
@ -1496,7 +1492,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "histogram_quantile(0.99, sum(rate(kubelet_cgroup_manager_duration_seconds_bucket{job=\"kubelet\", instance=~\"$instance\"}[5m])) by (instance, operation_type, le))", "expr": "histogram_quantile(0.99, sum(rate(kubelet_cgroup_manager_duration_seconds_bucket{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}[5m])) by (instance, operation_type, le))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}} {{operation_type}}", "legendFormat": "{{instance}} {{operation_type}}",
@ -1601,7 +1597,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(rate(kubelet_pleg_relist_duration_seconds_count{job=\"kubelet\", instance=~\"$instance\"}[5m])) by (instance)", "expr": "sum(rate(kubelet_pleg_relist_duration_seconds_count{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}[5m])) by (instance)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}}", "legendFormat": "{{instance}}",
@ -1692,7 +1688,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "histogram_quantile(0.99, sum(rate(kubelet_pleg_relist_interval_seconds_bucket{job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, le))", "expr": "histogram_quantile(0.99, sum(rate(kubelet_pleg_relist_interval_seconds_bucket{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, le))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}}", "legendFormat": "{{instance}}",
@ -1796,7 +1792,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "histogram_quantile(0.99, sum(rate(kubelet_pleg_relist_duration_seconds_bucket{job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, le))", "expr": "histogram_quantile(0.99, sum(rate(kubelet_pleg_relist_duration_seconds_bucket{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])) by (instance, le))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}}", "legendFormat": "{{instance}}",
@ -1900,28 +1896,28 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(rate(rest_client_requests_total{job=\"kubelet\", instance=~\"$instance\",code=~\"2..\"}[5m]))", "expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\",job=\"kubelet\", instance=~\"$instance\",code=~\"2..\"}[5m]))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "2xx", "legendFormat": "2xx",
"refId": "A" "refId": "A"
}, },
{ {
"expr": "sum(rate(rest_client_requests_total{job=\"kubelet\", instance=~\"$instance\",code=~\"3..\"}[5m]))", "expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\",job=\"kubelet\", instance=~\"$instance\",code=~\"3..\"}[5m]))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "3xx", "legendFormat": "3xx",
"refId": "B" "refId": "B"
}, },
{ {
"expr": "sum(rate(rest_client_requests_total{job=\"kubelet\", instance=~\"$instance\",code=~\"4..\"}[5m]))", "expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\",job=\"kubelet\", instance=~\"$instance\",code=~\"4..\"}[5m]))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "4xx", "legendFormat": "4xx",
"refId": "C" "refId": "C"
}, },
{ {
"expr": "sum(rate(rest_client_requests_total{job=\"kubelet\", instance=~\"$instance\",code=~\"5..\"}[5m]))", "expr": "sum(rate(rest_client_requests_total{cluster=\"$cluster\",job=\"kubelet\", instance=~\"$instance\",code=~\"5..\"}[5m]))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "5xx", "legendFormat": "5xx",
@ -2025,7 +2021,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "histogram_quantile(0.99, sum(rate(rest_client_request_latency_seconds_bucket{job=\"kubelet\", instance=~\"$instance\"}[5m])) by (instance, verb, url, le))", "expr": "histogram_quantile(0.99, sum(rate(rest_client_request_latency_seconds_bucket{cluster=\"$cluster\",job=\"kubelet\", instance=~\"$instance\"}[5m])) by (instance, verb, url, le))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}} {{verb}} {{url}}", "legendFormat": "{{instance}} {{verb}} {{url}}",
@ -2129,7 +2125,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "process_resident_memory_bytes{job=\"kubelet\",instance=~\"$instance\"}", "expr": "process_resident_memory_bytes{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}}", "legendFormat": "{{instance}}",
@ -2220,7 +2216,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "rate(process_cpu_seconds_total{job=\"kubelet\",instance=~\"$instance\"}[5m])", "expr": "rate(process_cpu_seconds_total{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}[5m])",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}}", "legendFormat": "{{instance}}",
@ -2311,7 +2307,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "go_goroutines{job=\"kubelet\",instance=~\"$instance\"}", "expr": "go_goroutines{cluster=\"$cluster\",job=\"kubelet\",instance=~\"$instance\"}",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{instance}}", "legendFormat": "{{instance}}",
@ -2395,6 +2391,32 @@ data:
"allValue": null, "allValue": null,
"current": { "current": {
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(kube_pod_info, cluster)",
"refresh": 2,
"regex": "",
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
}, },
"datasource": "$datasource", "datasource": "$datasource",
"hide": 0, "hide": 0,
@ -2405,10 +2427,10 @@ data:
"options": [ "options": [
], ],
"query": "label_values(kubelet_runtime_operations{job=\"kubelet\"}, instance)", "query": "label_values(kubelet_runtime_operations{cluster=\"$cluster\", job=\"kubelet\"}, instance)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 1,
"tagValuesQuery": "", "tagValuesQuery": "",
"tags": [ "tags": [
@ -2453,1367 +2475,6 @@ data:
"uid": "3138fa155d5915769fbded898ac09fd9", "uid": "3138fa155d5915769fbded898ac09fd9",
"version": 0 "version": 0
} }
nodes.json: |-
{
"__inputs": [
],
"__requires": [
],
"annotations": {
"list": [
]
},
"editable": false,
"gnetId": null,
"graphTooltip": 0,
"hideControls": false,
"id": null,
"links": [
],
"refresh": "",
"rows": [
{
"collapse": false,
"collapsed": false,
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"gridPos": {
},
"id": 2,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "max(node_load1{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "load 1m",
"refId": "A"
},
{
"expr": "max(node_load5{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "load 5m",
"refId": "B"
},
{
"expr": "max(node_load15{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "load 15m",
"refId": "C"
},
{
"expr": "count(node_cpu_seconds_total{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\", mode=\"user\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "logical cores",
"refId": "D"
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "System load",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"gridPos": {
},
"id": 3,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum by (cpu) (irate(node_cpu_seconds_total{cluster=\"$cluster\", job=\"node-exporter\", mode!=\"idle\", instance=\"$instance\"}[5m]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{cpu}}",
"refId": "A"
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Usage Per Core",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"gridPos": {
},
"id": 4,
"legend": {
"alignAsTable": "true",
"avg": "true",
"current": "true",
"max": "false",
"min": "false",
"rightSide": "true",
"show": "true",
"total": "false",
"values": "true"
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 9,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "max (sum by (cpu) (irate(node_cpu_seconds_total{cluster=\"$cluster\", job=\"node-exporter\", mode!=\"idle\", instance=\"$instance\"}[2m])) ) * 100\n",
"format": "time_series",
"intervalFactor": 10,
"legendFormat": "{{ cpu }}",
"refId": "A"
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Utilization",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "percent",
"label": null,
"logBase": 1,
"max": 100,
"min": 0,
"show": true
},
{
"format": "percent",
"label": null,
"logBase": 1,
"max": 100,
"min": 0,
"show": true
}
]
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"rgba(50, 172, 45, 0.97)",
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "$datasource",
"format": "percent",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": true,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
},
"id": 5,
"interval": null,
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 3,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"tableColumn": "",
"targets": [
{
"expr": "avg(sum by (cpu) (irate(node_cpu_seconds_total{cluster=\"$cluster\", job=\"node-exporter\", mode!=\"idle\", instance=\"$instance\"}[2m]))) * 100\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "80, 90",
"title": "CPU Usage",
"tooltip": {
"shared": false
},
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "current"
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"gridPos": {
},
"id": 6,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 9,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "max(\n node_memory_MemTotal_bytes{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"}\n - node_memory_MemFree_bytes{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"}\n - node_memory_Buffers_bytes{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"}\n - node_memory_Cached_bytes{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"}\n)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "memory used",
"refId": "A"
},
{
"expr": "max(node_memory_Buffers_bytes{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "memory buffers",
"refId": "B"
},
{
"expr": "max(node_memory_Cached_bytes{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "memory cached",
"refId": "C"
},
{
"expr": "max(node_memory_MemFree_bytes{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "memory free",
"refId": "D"
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Usage",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
]
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"rgba(50, 172, 45, 0.97)",
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "$datasource",
"format": "percent",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": true,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
},
"id": 7,
"interval": null,
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 3,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"tableColumn": "",
"targets": [
{
"expr": "max(\n (\n (\n node_memory_MemTotal_bytes{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"}\n - node_memory_MemFree_bytes{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"}\n - node_memory_Buffers_bytes{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"}\n - node_memory_Cached_bytes{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"}\n )\n / node_memory_MemTotal_bytes{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"}\n ) * 100)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "80, 90",
"title": "Memory Usage",
"tooltip": {
"shared": false
},
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "current"
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"gridPos": {
},
"id": 8,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
{
"alias": "read",
"yaxis": 1
},
{
"alias": "io time",
"yaxis": 2
}
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "max(rate(node_disk_read_bytes_total{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"}[2m]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "read",
"refId": "A"
},
{
"expr": "max(rate(node_disk_written_bytes_total{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"}[2m]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "written",
"refId": "B"
},
{
"expr": "max(rate(node_disk_io_time_seconds_total{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"}[2m]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "io time",
"refId": "C"
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Disk I/O",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "ms",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"gridPos": {
},
"id": 9,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "node:node_filesystem_usage:{cluster=\"$cluster\", instance=\"$instance\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{device}} disk used",
"refId": "A"
},
{
"expr": "node:node_filesystem_usage:{cluster=\"$cluster\", instance=\"$instance\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{device}} disk free",
"refId": "B"
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Disk Space Usage",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"gridPos": {
},
"id": 10,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "rate(node_network_receive_bytes_total{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\", device!~\"lo\"}[5m])",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{device}}",
"refId": "A"
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Network Received",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"gridPos": {
},
"id": 11,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "rate(node_network_transmit_bytes_total{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\", device!~\"lo\"}[5m])",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{device}}",
"refId": "A"
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Network Transmitted",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
},
{
"collapse": false,
"collapsed": false,
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"gridPos": {
},
"id": 12,
"legend": {
"alignAsTable": false,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"repeat": null,
"seriesOverrides": [
],
"spaceLength": 10,
"span": 9,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "max(\n node_filesystem_files{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"}\n - node_filesystem_files_free{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"}\n)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "inodes used",
"refId": "A"
},
{
"expr": "max(node_filesystem_files_free{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "inodes free",
"refId": "B"
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Inodes Usage",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
]
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": false,
"colors": [
"rgba(50, 172, 45, 0.97)",
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "$datasource",
"format": "percent",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": true,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
},
"id": 13,
"interval": null,
"links": [
],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"span": 3,
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"tableColumn": "",
"targets": [
{
"expr": "max(\n (\n (\n node_filesystem_files{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"}\n - node_filesystem_files_free{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"}\n )\n / node_filesystem_files{cluster=\"$cluster\", job=\"node-exporter\", instance=\"$instance\"}\n ) * 100)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
"refId": "A"
}
],
"thresholds": "80, 90",
"title": "Inodes Usage",
"tooltip": {
"shared": false
},
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "current"
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Dashboard Row",
"titleSize": "h6",
"type": "row"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [
"kubernetes-mixin"
],
"templating": {
"list": [
{
"current": {
"text": "Prometheus",
"value": "Prometheus"
},
"hide": 0,
"label": null,
"name": "datasource",
"options": [
],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": null,
"current": {
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(kube_pod_info, cluster)",
"refresh": 2,
"regex": "",
"sort": 0,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": null,
"multi": false,
"name": "instance",
"options": [
],
"query": "label_values(node_boot_time_seconds{cluster=\"$cluster\", job=\"node-exporter\"}, instance)",
"refresh": 2,
"regex": "",
"sort": 0,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "",
"title": "Kubernetes / Nodes",
"uid": "fa49a4706d07a042595b664c87fb33ea",
"version": 0
}
proxy.json: |- proxy.json: |-
{ {
"__inputs": [ "__inputs": [
@ -4958,7 +3619,7 @@ data:
"query": "label_values(kubeproxy_network_programming_duration_seconds_bucket{job=\"kube-proxy\"}, instance)", "query": "label_values(kubeproxy_network_programming_duration_seconds_bucket{job=\"kube-proxy\"}, instance)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 1,
"tagValuesQuery": "", "tagValuesQuery": "",
"tags": [ "tags": [
@ -5003,3 +3664,7 @@ data:
"uid": "632e265de029684c40b21cb76bca4f94", "uid": "632e265de029684c40b21cb76bca4f94",
"version": 0 "version": 0
} }
kind: ConfigMap
metadata:
name: grafana-dashboards-k8s-nodes
namespace: monitoring

View File

@ -0,0 +1,5375 @@
apiVersion: v1
data:
k8s-resources-cluster.json: |-
{
"annotations": {
"list": [
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"hideControls": false,
"links": [
],
"refresh": "10s",
"rows": [
{
"collapse": false,
"height": "100px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"format": "percentunit",
"id": 1,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 2,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "1 - avg(rate(node_cpu_seconds_total{mode=\"idle\", cluster=\"$cluster\"}[1m]))",
"format": "time_series",
"instant": true,
"intervalFactor": 2,
"refId": "A"
}
],
"thresholds": "70,80",
"timeFrom": null,
"timeShift": null,
"title": "CPU Utilisation",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "singlestat",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"format": "percentunit",
"id": 2,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 2,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\"}) / sum(kube_node_status_allocatable_cpu_cores{cluster=\"$cluster\"})",
"format": "time_series",
"instant": true,
"intervalFactor": 2,
"refId": "A"
}
],
"thresholds": "70,80",
"timeFrom": null,
"timeShift": null,
"title": "CPU Requests Commitment",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "singlestat",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"format": "percentunit",
"id": 3,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 2,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\"}) / sum(kube_node_status_allocatable_cpu_cores{cluster=\"$cluster\"})",
"format": "time_series",
"instant": true,
"intervalFactor": 2,
"refId": "A"
}
],
"thresholds": "70,80",
"timeFrom": null,
"timeShift": null,
"title": "CPU Limits Commitment",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "singlestat",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"format": "percentunit",
"id": 4,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 2,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "1 - sum(:node_memory_MemAvailable_bytes:sum{cluster=\"$cluster\"}) / sum(kube_node_status_allocatable_memory_bytes{cluster=\"$cluster\"})",
"format": "time_series",
"instant": true,
"intervalFactor": 2,
"refId": "A"
}
],
"thresholds": "70,80",
"timeFrom": null,
"timeShift": null,
"title": "Memory Utilisation",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "singlestat",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"format": "percentunit",
"id": 5,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 2,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\"}) / sum(kube_node_status_allocatable_memory_bytes{cluster=\"$cluster\"})",
"format": "time_series",
"instant": true,
"intervalFactor": 2,
"refId": "A"
}
],
"thresholds": "70,80",
"timeFrom": null,
"timeShift": null,
"title": "Memory Requests Commitment",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "singlestat",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"format": "percentunit",
"id": 6,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 2,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\"}) / sum(kube_node_status_allocatable_memory_bytes{cluster=\"$cluster\"})",
"format": "time_series",
"instant": true,
"intervalFactor": 2,
"refId": "A"
}
],
"thresholds": "70,80",
"timeFrom": null,
"timeShift": null,
"title": "Memory Limits Commitment",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "singlestat",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": false,
"title": "Headlines",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 7,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\"}) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Usage",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "CPU",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 8,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "Pods",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 0,
"link": true,
"linkTooltip": "Drill down to pods",
"linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "Workloads",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 0,
"link": true,
"linkTooltip": "Drill down to workloads",
"linkUrl": "./d/a87fb0d919ec0ea5f6543124e16c42a5/k8s-resources-workloads-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Usage",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Requests",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Requests %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "CPU Limits",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #F",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Limits %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #G",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Namespace",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": true,
"linkTooltip": "Drill down to pods",
"linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell",
"pattern": "namespace",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "count(mixin_pod_workload{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "count(avg(mixin_pod_workload{cluster=\"$cluster\"}) by (workload, namespace)) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\"}) by (namespace) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "F",
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\"}) by (namespace) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "G",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Quota",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "CPU Quota",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 9,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(container_memory_rss{cluster=\"$cluster\", container!=\"\"}) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Usage (w/o cache)",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Memory",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 10,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "Pods",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 0,
"link": true,
"linkTooltip": "Drill down to pods",
"linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "Workloads",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 0,
"link": true,
"linkTooltip": "Drill down to workloads",
"linkUrl": "./d/a87fb0d919ec0ea5f6543124e16c42a5/k8s-resources-workloads-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "Memory Usage",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Requests",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Requests %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Memory Limits",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #F",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Limits %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #G",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Namespace",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": true,
"linkTooltip": "Drill down to pods",
"linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell",
"pattern": "namespace",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "count(mixin_pod_workload{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "count(avg(mixin_pod_workload{cluster=\"$cluster\"}) by (workload, namespace)) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum(container_memory_rss{cluster=\"$cluster\", container!=\"\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum(container_memory_rss{cluster=\"$cluster\", container!=\"\"}) by (namespace) / sum(kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "F",
"step": 10
},
{
"expr": "sum(container_memory_rss{cluster=\"$cluster\", container!=\"\"}) by (namespace) / sum(kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\"}) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "G",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Requests by Namespace",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Memory Requests",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 11,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "Current Receive Bandwidth",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Current Transmit Bandwidth",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Rate of Received Packets",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Transmitted Packets",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Received Packets Dropped",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Transmitted Packets Dropped",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #F",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Namespace",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": true,
"linkTooltip": "Drill down to pods",
"linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell",
"pattern": "namespace",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$interval])) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$interval])) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$interval])) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$interval])) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$interval])) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
},
{
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$interval])) by (namespace)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "F",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Current Network Usage",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 12,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$interval])) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Receive Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 13,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$interval])) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Transmit Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 14,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$interval])) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Average Container Bandwidth by Namespace: Received",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 15,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$interval])) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Average Container Bandwidth by Namespace: Transmitted",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 16,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$interval])) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 17,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$interval])) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 18,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$interval])) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 19,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$interval])) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [
"kubernetes-mixin"
],
"templating": {
"list": [
{
"current": {
"text": "Prometheus",
"value": "Prometheus"
},
"hide": 0,
"label": null,
"name": "datasource",
"options": [
],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(node_cpu_seconds_total, cluster)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "5m",
"value": "5m"
},
"datasource": "prometheus",
"hide": 2,
"includeAll": false,
"label": null,
"multi": false,
"name": "interval",
"options": [
{
"selected": true,
"text": "4h",
"value": "4h"
}
],
"query": "4h",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "interval",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "",
"title": "Kubernetes / Compute Resources / Cluster",
"uid": "efa86fd1d0c121a26444b636a3f509a8",
"version": 0
}
k8s-resources-namespace.json: |-
{
"annotations": {
"list": [
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"hideControls": false,
"links": [
],
"refresh": "10s",
"rows": [
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 1,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Usage",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "CPU Usage",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 2,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "CPU Usage",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Requests",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Requests %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "CPU Limits",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Limits %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Pod",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": true,
"linkTooltip": "Drill down",
"linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"pattern": "pod",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Quota",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "CPU Quota",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 3,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Usage (w/o cache)",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Memory Usage",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 4,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "Memory Usage",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Requests",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Requests %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Memory Limits",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Limits %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Memory Usage (RSS)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #F",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Usage (Cache)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #G",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Usage (Swap)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #H",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Pod",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": true,
"linkTooltip": "Drill down",
"linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"pattern": "pod",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod) / sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod) / sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
},
{
"expr": "sum(container_memory_rss{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "F",
"step": 10
},
{
"expr": "sum(container_memory_cache{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "G",
"step": 10
},
{
"expr": "sum(container_memory_swap{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "H",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Quota",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Memory Quota",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 5,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "Current Receive Bandwidth",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Current Transmit Bandwidth",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Rate of Received Packets",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Transmitted Packets",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Received Packets Dropped",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Transmitted Packets Dropped",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #F",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Pod",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": true,
"linkTooltip": "Drill down to pods",
"linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"pattern": "pod",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
},
{
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "F",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Current Network Usage",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 6,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Receive Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 7,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Transmit Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 8,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 9,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 10,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 11,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [
"kubernetes-mixin"
],
"templating": {
"list": [
{
"current": {
"text": "Prometheus",
"value": "Prometheus"
},
"hide": 0,
"label": null,
"name": "datasource",
"options": [
],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(kube_pod_info, cluster)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": "namespace",
"multi": false,
"name": "namespace",
"options": [
],
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "5m",
"value": "5m"
},
"datasource": "prometheus",
"hide": 2,
"includeAll": false,
"label": null,
"multi": false,
"name": "interval",
"options": [
{
"selected": true,
"text": "4h",
"value": "4h"
}
],
"query": "4h",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "interval",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "",
"title": "Kubernetes / Compute Resources / Namespace (Pods)",
"uid": "85a562078cdf77779eaa1add43ccec1e",
"version": 0
}
k8s-resources-node.json: |-
{
"annotations": {
"list": [
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"hideControls": false,
"links": [
],
"refresh": "10s",
"rows": [
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 1,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Usage",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "CPU Usage",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 2,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "CPU Usage",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Requests",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Requests %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "CPU Limits",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Limits %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Pod",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "pod",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=\"$node\"}) by (pod) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=\"$node\"}) by (pod) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Quota",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "CPU Quota",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 3,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster=\"$cluster\", node=\"$node\", container!=\"\"}) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Usage (w/o cache)",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Memory Usage",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 4,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "Memory Usage",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Requests",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Requests %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Memory Limits",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Limits %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Memory Usage (RSS)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #F",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Usage (Cache)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #G",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Usage (Swap)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #H",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Pod",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "pod",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster=\"$cluster\", node=\"$node\",container!=\"\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster=\"$cluster\", node=\"$node\",container!=\"\"}) by (pod) / sum(kube_pod_container_resource_requests_memory_bytes{node=\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster=\"$cluster\", node=\"$node\",container!=\"\"}) by (pod) / sum(kube_pod_container_resource_limits_memory_bytes{node=\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_memory_rss{cluster=\"$cluster\", node=\"$node\",container!=\"\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "F",
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_memory_cache{cluster=\"$cluster\", node=\"$node\",container!=\"\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "G",
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_memory_swap{cluster=\"$cluster\", node=\"$node\",container!=\"\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "H",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Quota",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Memory Quota",
"titleSize": "h6"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [
"kubernetes-mixin"
],
"templating": {
"list": [
{
"current": {
"text": "Prometheus",
"value": "Prometheus"
},
"hide": 0,
"label": null,
"name": "datasource",
"options": [
],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(kube_pod_info, cluster)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": "node",
"multi": false,
"name": "node",
"options": [
],
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, node)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "",
"title": "Kubernetes / Compute Resources / Node (Pods)",
"uid": "200ac8fdbfbb74b39aff88118e4d1c2c",
"version": 0
}
kind: ConfigMap
metadata:
name: grafana-dashboards-k8s-resources-1
namespace: monitoring

View File

@ -0,0 +1,5753 @@
apiVersion: v1
data:
k8s-resources-pod.json: |-
{
"annotations": {
"list": [
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"hideControls": false,
"links": [
],
"refresh": "10s",
"rows": [
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 1,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", cluster=\"$cluster\"}) by (container)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{container}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Usage",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "CPU Usage",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 2,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "CPU Usage",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Requests",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Requests %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "CPU Limits",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Limits %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Container",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "container",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Quota",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "CPU Quota",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 3,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(container_memory_rss{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\"}) by (container)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{container}} (RSS)",
"legendLink": null,
"step": 10
},
{
"expr": "sum(container_memory_cache{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\"}) by (container)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{container}} (Cache)",
"legendLink": null,
"step": 10
},
{
"expr": "sum(container_memory_swap{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\"}) by (container)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{container}} (Swap)",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Usage",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Memory Usage",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 4,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "Memory Usage",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Requests",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Requests %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Memory Limits",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Limits %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Memory Usage (RSS)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #F",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Usage (Cache)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #G",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Usage (Swap)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #H",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Container",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "container",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}) by (container) / sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
},
{
"expr": "sum(container_memory_rss{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container != \"\", container != \"POD\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "F",
"step": 10
},
{
"expr": "sum(container_memory_cache{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container != \"\", container != \"POD\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "G",
"step": 10
},
{
"expr": "sum(container_memory_swap{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container != \"\", container != \"POD\"}) by (container)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "H",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Quota",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Memory Quota",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 5,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_bytes_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Receive Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 6,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_bytes_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Transmit Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 7,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_packets_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 8,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_packets_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 9,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_receive_packets_dropped_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 10,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(container_network_transmit_packets_dropped_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [
"kubernetes-mixin"
],
"templating": {
"list": [
{
"current": {
"text": "Prometheus",
"value": "Prometheus"
},
"hide": 0,
"label": null,
"name": "datasource",
"options": [
],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(kube_pod_info, cluster)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": "namespace",
"multi": false,
"name": "namespace",
"options": [
],
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": "pod",
"multi": false,
"name": "pod",
"options": [
],
"query": "label_values(kube_pod_info{cluster=\"$cluster\", namespace=\"$namespace\"}, pod)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "5m",
"value": "5m"
},
"datasource": "prometheus",
"hide": 2,
"includeAll": false,
"label": null,
"multi": false,
"name": "interval",
"options": [
{
"selected": true,
"text": "4h",
"value": "4h"
}
],
"query": "4h",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "interval",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "",
"title": "Kubernetes / Compute Resources / Pod",
"uid": "6581e46e4e5c7ba40a07646395ef7b23",
"version": 0
}
k8s-resources-workload.json: |-
{
"annotations": {
"list": [
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"hideControls": false,
"links": [
],
"refresh": "10s",
"rows": [
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 1,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Usage",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "CPU Usage",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 2,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "CPU Usage",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Requests",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Requests %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "CPU Limits",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Limits %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Pod",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": true,
"linkTooltip": "Drill down",
"linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"pattern": "pod",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Quota",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "CPU Quota",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 3,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Usage",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Memory Usage",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 4,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "Memory Usage",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Requests",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Requests %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Memory Limits",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Limits %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Pod",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": true,
"linkTooltip": "Drill down",
"linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"pattern": "pod",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Quota",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Memory Quota",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 5,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "Current Receive Bandwidth",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Current Transmit Bandwidth",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Rate of Received Packets",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Transmitted Packets",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Received Packets Dropped",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Transmitted Packets Dropped",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #F",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Pod",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": true,
"linkTooltip": "Drill down",
"linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"pattern": "pod",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
},
{
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "F",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Current Network Usage",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 6,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Receive Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 7,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Transmit Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 8,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Average Container Bandwidth by Pod: Received",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 9,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Average Container Bandwidth by Pod: Transmitted",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 10,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 11,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 12,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 13,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod) \ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [
"kubernetes-mixin"
],
"templating": {
"list": [
{
"current": {
"text": "Prometheus",
"value": "Prometheus"
},
"hide": 0,
"label": null,
"name": "datasource",
"options": [
],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(kube_pod_info, cluster)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": "namespace",
"multi": false,
"name": "namespace",
"options": [
],
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": "workload",
"multi": false,
"name": "workload",
"options": [
],
"query": "label_values(mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}, workload)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": "type",
"multi": false,
"name": "type",
"options": [
],
"query": "label_values(mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\"}, workload_type)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "5m",
"value": "5m"
},
"datasource": "prometheus",
"hide": 2,
"includeAll": false,
"label": null,
"multi": false,
"name": "interval",
"options": [
{
"selected": true,
"text": "4h",
"value": "4h"
}
],
"query": "4h",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "interval",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "",
"title": "Kubernetes / Compute Resources / Workload",
"uid": "a164a7f0339f99e89cea5cb47e9be617",
"version": 0
}
k8s-resources-workloads-namespace.json: |-
{
"annotations": {
"list": [
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"hideControls": false,
"links": [
],
"refresh": "10s",
"rows": [
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 1,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}} - {{workload_type}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Usage",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "CPU Usage",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 2,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "Running Pods",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 0,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Usage",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Requests",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Requests %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "CPU Limits",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Limits %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #F",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Workload",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": true,
"linkTooltip": "Drill down",
"linkUrl": "./d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$__cell_2",
"pattern": "workload",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "Workload Type",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "workload_type",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "count(mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}) by (workload, workload_type)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
},
{
"expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "F",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Quota",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "CPU Quota",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 3,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}} - {{workload_type}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Usage",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Memory Usage",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 4,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "Running Pods",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 0,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "Memory Usage",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Requests",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Requests %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Memory Limits",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Limits %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #F",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Workload",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": true,
"linkTooltip": "Drill down",
"linkUrl": "./d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$__cell_2",
"pattern": "workload",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "Workload Type",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "workload_type",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "count(mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}) by (workload, workload_type)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
},
{
"expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "F",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Quota",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Memory Quota",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 5,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "Current Receive Bandwidth",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Current Transmit Bandwidth",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "Bps"
},
{
"alias": "Rate of Received Packets",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Transmitted Packets",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Received Packets Dropped",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Rate of Transmitted Packets Dropped",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #F",
"thresholds": [
],
"type": "number",
"unit": "pps"
},
{
"alias": "Workload",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": true,
"linkTooltip": "Drill down to pods",
"linkUrl": "./d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$type",
"pattern": "workload",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "Workload Type",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "workload_type",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
},
{
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "F",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Current Network Usage",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 6,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Receive Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 7,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Transmit Bandwidth",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 8,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Average Container Bandwidth by Workload: Received",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 9,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Average Container Bandwidth by Workload: Transmitted",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 10,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 11,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 12,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Received Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 13,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$interval])\n* on (namespace,pod) \ngroup_left(workload,workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{workload}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Rate of Transmitted Packets Dropped",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [
"kubernetes-mixin"
],
"templating": {
"list": [
{
"current": {
"text": "Prometheus",
"value": "Prometheus"
},
"hide": 0,
"label": null,
"name": "datasource",
"options": [
],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(kube_pod_info, cluster)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": "namespace",
"multi": false,
"name": "namespace",
"options": [
],
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "5m",
"value": "5m"
},
"datasource": "prometheus",
"hide": 2,
"includeAll": false,
"label": null,
"multi": false,
"name": "interval",
"options": [
{
"selected": true,
"text": "4h",
"value": "4h"
}
],
"query": "4h",
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "interval",
"useTags": false
},
{
"allValue": null,
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "deployment",
"value": "deployment"
},
"datasource": "$datasource",
"definition": "label_values(mixin_pod_workload{namespace=~\"$namespace\", workload=~\".+\"}, workload_type)",
"hide": 0,
"includeAll": false,
"label": null,
"multi": false,
"name": "type",
"options": [
],
"query": "label_values(mixin_pod_workload{namespace=~\"$namespace\", workload=~\".+\"}, workload_type)",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "",
"title": "Kubernetes / Compute Resources / Namespace (Workloads)",
"uid": "a87fb0d919ec0ea5f6543124e16c42a5",
"version": 0
}
kind: ConfigMap
metadata:
name: grafana-dashboards-k8s-resources-2
namespace: monitoring

View File

@ -1,1910 +1,5 @@
apiVersion: v1 apiVersion: v1
kind: ConfigMap
metadata:
name: grafana-dashboards-k8s-resources
namespace: monitoring
data: data:
k8s-cluster-rsrc-use.json: |-
{
"annotations": {
"list": [
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"hideControls": false,
"links": [
],
"refresh": "10s",
"rows": [
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 1,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "node:cluster_cpu_utilisation:ratio{cluster=\"$cluster\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{node}}",
"legendLink": "/d/4ac4f123aae0ff6dbaf4f4f66120033b/k8s-node-rsrc-use",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Utilisation",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": 1,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 2,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "node:node_cpu_saturation_load1:{cluster=\"$cluster\"} / scalar(sum(min(kube_pod_info{cluster=\"$cluster\"}) by (node)))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{node}}",
"legendLink": "/d/4ac4f123aae0ff6dbaf4f4f66120033b/k8s-node-rsrc-use",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Saturation (Load1)",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": 1,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "CPU",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 3,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "node:cluster_memory_utilisation:ratio{cluster=\"$cluster\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{node}}",
"legendLink": "/d/4ac4f123aae0ff6dbaf4f4f66120033b/k8s-node-rsrc-use",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Utilisation",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": 1,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 4,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "node:node_memory_swap_io_bytes:sum_rate{cluster=\"$cluster\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{node}}",
"legendLink": "/d/4ac4f123aae0ff6dbaf4f4f66120033b/k8s-node-rsrc-use",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Saturation (Swap I/O)",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Memory",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 5,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "node:node_disk_utilisation:avg_irate{cluster=\"$cluster\"} / scalar(:kube_pod_info_node_count:{cluster=\"$cluster\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{node}}",
"legendLink": "/d/4ac4f123aae0ff6dbaf4f4f66120033b/k8s-node-rsrc-use",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Disk IO Utilisation",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": 1,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 6,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "node:node_disk_saturation:avg_irate{cluster=\"$cluster\"} / scalar(:kube_pod_info_node_count:{cluster=\"$cluster\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{node}}",
"legendLink": "/d/4ac4f123aae0ff6dbaf4f4f66120033b/k8s-node-rsrc-use",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Disk IO Saturation",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": 1,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Disk",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 7,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "node:node_net_utilisation:sum_irate{cluster=\"$cluster\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{node}}",
"legendLink": "/d/4ac4f123aae0ff6dbaf4f4f66120033b/k8s-node-rsrc-use",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Net Utilisation (Transmitted)",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 8,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "node:node_net_saturation:sum_irate{cluster=\"$cluster\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{node}}",
"legendLink": "/d/4ac4f123aae0ff6dbaf4f4f66120033b/k8s-node-rsrc-use",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Net Saturation (Dropped)",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Network",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 9,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(max(node_filesystem_size_bytes{fstype=~\"ext[234]|btrfs|xfs|zfs\", cluster=\"$cluster\"} - node_filesystem_avail_bytes{fstype=~\"ext[234]|btrfs|xfs|zfs\", cluster=\"$cluster\"}) by (device,pod,namespace)) by (pod,namespace)\n/ scalar(sum(max(node_filesystem_size_bytes{fstype=~\"ext[234]|btrfs|xfs|zfs\", cluster=\"$cluster\"}) by (device,pod,namespace)))\n* on (namespace, pod) group_left (node) node_namespace_pod:kube_pod_info:{cluster=\"$cluster\"}\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{node}}",
"legendLink": "/d/4ac4f123aae0ff6dbaf4f4f66120033b/k8s-node-rsrc-use",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Disk Capacity",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": 1,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Storage",
"titleSize": "h6"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [
"kubernetes-mixin"
],
"templating": {
"list": [
{
"current": {
"text": "Prometheus",
"value": "Prometheus"
},
"hide": 0,
"label": null,
"name": "datasource",
"options": [
],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(kube_node_info, cluster)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "",
"title": "Kubernetes / USE Method / Cluster",
"uid": "a6e7d1362e1ddbb79db21d5bb40d7137",
"version": 0
}
k8s-node-rsrc-use.json: |-
{
"annotations": {
"list": [
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"hideControls": false,
"links": [
],
"refresh": "10s",
"rows": [
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 1,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "node:node_cpu_utilisation:avg1m{cluster=\"$cluster\", node=\"$node\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Utilisation",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Utilisation",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 2,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "node:node_cpu_saturation_load1:{cluster=\"$cluster\", node=\"$node\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Saturation",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Saturation (Load1)",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "CPU",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 3,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "node:node_memory_utilisation:{cluster=\"$cluster\", node=\"$node\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Memory",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Utilisation",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 4,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "node:node_memory_swap_io_bytes:sum_rate{cluster=\"$cluster\", node=\"$node\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Swap IO",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Saturation (Swap I/O)",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Memory",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 5,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "node:node_disk_utilisation:avg_irate{cluster=\"$cluster\", node=\"$node\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Utilisation",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Disk IO Utilisation",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 6,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "node:node_disk_saturation:avg_irate{cluster=\"$cluster\", node=\"$node\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Saturation",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Disk IO Saturation",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Disk",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 7,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "node:node_net_utilisation:sum_irate{cluster=\"$cluster\", node=\"$node\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Utilisation",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Net Utilisation (Transmitted)",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
},
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 8,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 6,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "node:node_net_saturation:sum_irate{cluster=\"$cluster\", node=\"$node\"}",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "Saturation",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Net Saturation (Dropped)",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "Bps",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Net",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 9,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "node:node_filesystem_usage:{cluster=\"$cluster\"}\n* on (namespace, pod) group_left (node) node_namespace_pod:kube_pod_info:{cluster=\"$cluster\", node=\"$node\"}\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{device}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Disk Utilisation",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "percentunit",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Disk",
"titleSize": "h6"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [
"kubernetes-mixin"
],
"templating": {
"list": [
{
"current": {
"text": "Prometheus",
"value": "Prometheus"
},
"hide": 0,
"label": null,
"name": "datasource",
"options": [
],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(kube_node_info, cluster)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": "node",
"multi": false,
"name": "node",
"options": [
],
"query": "label_values(kube_node_info{cluster=\"$cluster\"}, node)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "",
"title": "Kubernetes / USE Method / Node",
"uid": "4ac4f123aae0ff6dbaf4f4f66120033b",
"version": 0
}
k8s-resources-cluster.json: |- k8s-resources-cluster.json: |-
{ {
"annotations": { "annotations": {
@ -2479,7 +574,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\"}) by (namespace)", "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\"}) by (namespace)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{namespace}}", "legendFormat": "{{namespace}}",
@ -2592,7 +687,7 @@ data:
"decimals": 0, "decimals": 0,
"link": true, "link": true,
"linkTooltip": "Drill down to pods", "linkTooltip": "Drill down to pods",
"linkUrl": "/d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1", "linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
"pattern": "Value #A", "pattern": "Value #A",
"thresholds": [ "thresholds": [
@ -2610,7 +705,7 @@ data:
"decimals": 0, "decimals": 0,
"link": true, "link": true,
"linkTooltip": "Drill down to workloads", "linkTooltip": "Drill down to workloads",
"linkUrl": "/d/a87fb0d919ec0ea5f6543124e16c42a5/k8s-resources-workloads-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1", "linkUrl": "./d/a87fb0d919ec0ea5f6543124e16c42a5/k8s-resources-workloads-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
"pattern": "Value #B", "pattern": "Value #B",
"thresholds": [ "thresholds": [
@ -2718,7 +813,7 @@ data:
"decimals": 2, "decimals": 2,
"link": true, "link": true,
"linkTooltip": "Drill down to pods", "linkTooltip": "Drill down to pods",
"linkUrl": "/d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell", "linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell",
"pattern": "namespace", "pattern": "namespace",
"thresholds": [ "thresholds": [
@ -2762,7 +857,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\"}) by (namespace)", "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\"}) by (namespace)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -2780,7 +875,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\"}) by (namespace) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\"}) by (namespace)", "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\"}) by (namespace) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\"}) by (namespace)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -2798,7 +893,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\"}) by (namespace) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\"}) by (namespace)", "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\"}) by (namespace) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\"}) by (namespace)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -3011,7 +1106,7 @@ data:
"decimals": 0, "decimals": 0,
"link": true, "link": true,
"linkTooltip": "Drill down to pods", "linkTooltip": "Drill down to pods",
"linkUrl": "/d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1", "linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
"pattern": "Value #A", "pattern": "Value #A",
"thresholds": [ "thresholds": [
@ -3029,7 +1124,7 @@ data:
"decimals": 0, "decimals": 0,
"link": true, "link": true,
"linkTooltip": "Drill down to workloads", "linkTooltip": "Drill down to workloads",
"linkUrl": "/d/a87fb0d919ec0ea5f6543124e16c42a5/k8s-resources-workloads-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1", "linkUrl": "./d/a87fb0d919ec0ea5f6543124e16c42a5/k8s-resources-workloads-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1",
"pattern": "Value #B", "pattern": "Value #B",
"thresholds": [ "thresholds": [
@ -3137,7 +1232,7 @@ data:
"decimals": 2, "decimals": 2,
"link": true, "link": true,
"linkTooltip": "Drill down to pods", "linkTooltip": "Drill down to pods",
"linkUrl": "/d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell", "linkUrl": "./d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell",
"pattern": "namespace", "pattern": "namespace",
"thresholds": [ "thresholds": [
@ -3420,7 +1515,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)", "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -3623,7 +1718,7 @@ data:
"decimals": 2, "decimals": 2,
"link": true, "link": true,
"linkTooltip": "Drill down", "linkTooltip": "Drill down",
"linkUrl": "/d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell", "linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"pattern": "pod", "pattern": "pod",
"thresholds": [ "thresholds": [
@ -3649,7 +1744,7 @@ data:
], ],
"targets": [ "targets": [
{ {
"expr": "sum(namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)", "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -3667,7 +1762,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)", "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -3685,7 +1780,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)", "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}) by (pod)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -3785,7 +1880,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(container_memory_usage_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}) by (pod)", "expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}) by (pod)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -4015,7 +2110,7 @@ data:
"unit": "bytes" "unit": "bytes"
}, },
{ {
"alias": "Memory Usage (Swap", "alias": "Memory Usage (Swap)",
"colorMode": null, "colorMode": null,
"colors": [ "colors": [
@ -4042,7 +2137,7 @@ data:
"decimals": 2, "decimals": 2,
"link": true, "link": true,
"linkTooltip": "Drill down", "linkTooltip": "Drill down",
"linkUrl": "/d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell", "linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"pattern": "pod", "pattern": "pod",
"thresholds": [ "thresholds": [
@ -4068,7 +2163,7 @@ data:
], ],
"targets": [ "targets": [
{ {
"expr": "sum(container_memory_usage_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod)", "expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -4086,7 +2181,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(container_memory_usage_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod) / sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\"}) by (pod)", "expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod) / sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\"}) by (pod)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -4104,7 +2199,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(container_memory_usage_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod) / sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\"}) by (pod)", "expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) by (pod) / sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\"}) by (pod)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -4303,6 +2398,947 @@ data:
"uid": "85a562078cdf77779eaa1add43ccec1e", "uid": "85a562078cdf77779eaa1add43ccec1e",
"version": 0 "version": 0
} }
k8s-resources-node.json: |-
{
"annotations": {
"list": [
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"hideControls": false,
"links": [
],
"refresh": "10s",
"rows": [
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 1,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Usage",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "CPU Usage",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 2,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "CPU Usage",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Requests",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Requests %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "CPU Limits",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "CPU Limits %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Pod",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "pod",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=\"$node\"}) by (pod) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", node=\"$node\"}) by (pod) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "CPU Quota",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "CPU Quota",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 10,
"id": 3,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 0,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": true,
"steppedLine": false,
"targets": [
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", node=\"$node\", container!=\"\"}) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
"legendLink": null,
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Usage (w/o cache)",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Memory Usage",
"titleSize": "h6"
},
{
"collapse": false,
"height": "250px",
"panels": [
{
"aliasColors": {
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fill": 1,
"id": 4,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [
],
"nullPointMode": "null as zero",
"percentage": false,
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
],
"spaceLength": 10,
"span": 12,
"stack": false,
"steppedLine": false,
"styles": [
{
"alias": "Time",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "hidden"
},
{
"alias": "Memory Usage",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #A",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Requests",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #B",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Requests %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #C",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Memory Limits",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #D",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Limits %",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #E",
"thresholds": [
],
"type": "number",
"unit": "percentunit"
},
{
"alias": "Memory Usage (RSS)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #F",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Usage (Cache)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #G",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Memory Usage (Swap)",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "Value #H",
"thresholds": [
],
"type": "number",
"unit": "bytes"
},
{
"alias": "Pod",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"link": false,
"linkTooltip": "Drill down",
"linkUrl": "",
"pattern": "pod",
"thresholds": [
],
"type": "number",
"unit": "short"
},
{
"alias": "",
"colorMode": null,
"colors": [
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"pattern": "/.*/",
"thresholds": [
],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", node=\"$node\",container!=\"\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "A",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "B",
"step": 10
},
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", node=\"$node\",container!=\"\"}) by (pod) / sum(kube_pod_container_resource_requests_memory_bytes{node=\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "C",
"step": 10
},
{
"expr": "sum(kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", node=\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "D",
"step": 10
},
{
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", node=\"$node\",container!=\"\"}) by (pod) / sum(kube_pod_container_resource_limits_memory_bytes{node=\"$node\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "E",
"step": 10
},
{
"expr": "sum(container_memory_rss{cluster=\"$cluster\", node=\"$node\",container!=\"\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "F",
"step": 10
},
{
"expr": "sum(container_memory_cache{cluster=\"$cluster\", node=\"$node\",container!=\"\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "G",
"step": 10
},
{
"expr": "sum(container_memory_swap{cluster=\"$cluster\", node=\"$node\",container!=\"\"}) by (pod)",
"format": "table",
"instant": true,
"intervalFactor": 2,
"legendFormat": "",
"refId": "H",
"step": 10
}
],
"thresholds": [
],
"timeFrom": null,
"timeShift": null,
"title": "Memory Quota",
"tooltip": {
"shared": false,
"sort": 0,
"value_type": "individual"
},
"transform": "table",
"type": "table",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [
]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": 0,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
]
}
],
"repeat": null,
"repeatIteration": null,
"repeatRowId": null,
"showTitle": true,
"title": "Memory Quota",
"titleSize": "h6"
}
],
"schemaVersion": 14,
"style": "dark",
"tags": [
"kubernetes-mixin"
],
"templating": {
"list": [
{
"current": {
"text": "Prometheus",
"value": "Prometheus"
},
"hide": 0,
"label": null,
"name": "datasource",
"options": [
],
"query": "prometheus",
"refresh": 1,
"regex": "",
"type": "datasource"
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 2,
"includeAll": false,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(kube_pod_info, cluster)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {
"text": "prod",
"value": "prod"
},
"datasource": "$datasource",
"hide": 0,
"includeAll": false,
"label": "node",
"multi": false,
"name": "node",
"options": [
],
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, node)",
"refresh": 1,
"regex": "",
"sort": 2,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "",
"title": "Kubernetes / Compute Resources / Node (Pods)",
"uid": "200ac8fdbfbb74b39aff88118e4d1c2c",
"version": 0
}
k8s-resources-pod.json: |- k8s-resources-pod.json: |-
{ {
"annotations": { "annotations": {
@ -4361,7 +3397,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", cluster=\"$cluster\"}) by (container)", "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", cluster=\"$cluster\"}) by (container)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{container}}", "legendFormat": "{{container}}",
@ -4590,7 +3626,7 @@ data:
], ],
"targets": [ "targets": [
{ {
"expr": "sum(namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\"}) by (container)", "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\"}) by (container)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -4608,7 +3644,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)", "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -4626,7 +3662,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)", "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -4972,7 +4008,7 @@ data:
"unit": "bytes" "unit": "bytes"
}, },
{ {
"alias": "Memory Usage (Swap", "alias": "Memory Usage (Swap)",
"colorMode": null, "colorMode": null,
"colors": [ "colors": [
@ -5025,7 +4061,7 @@ data:
], ],
"targets": [ "targets": [
{ {
"expr": "sum(container_memory_usage_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\"}) by (container)", "expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\"}) by (container)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -5043,7 +4079,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(container_memory_usage_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\", pod=\"$pod\"}) by (container)", "expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -5061,7 +4097,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(container_memory_usage_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}) by (container) / sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\", pod=\"$pod\"}) by (container)", "expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}) by (container) / sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\", pod=\"$pod\"}) by (container)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -5345,7 +4381,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(\n namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", "expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -5548,7 +4584,7 @@ data:
"decimals": 2, "decimals": 2,
"link": true, "link": true,
"linkTooltip": "Drill down", "linkTooltip": "Drill down",
"linkUrl": "/d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell", "linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"pattern": "pod", "pattern": "pod",
"thresholds": [ "thresholds": [
@ -5574,7 +4610,7 @@ data:
], ],
"targets": [ "targets": [
{ {
"expr": "sum(\n namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", "expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -5592,7 +4628,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(\n namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", "expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -5610,7 +4646,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(\n namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", "expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -5710,7 +4746,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(\n container_memory_usage_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", "expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -5913,7 +4949,7 @@ data:
"decimals": 2, "decimals": 2,
"link": true, "link": true,
"linkTooltip": "Drill down", "linkTooltip": "Drill down",
"linkUrl": "/d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell", "linkUrl": "./d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell",
"pattern": "pod", "pattern": "pod",
"thresholds": [ "thresholds": [
@ -5939,7 +4975,7 @@ data:
], ],
"targets": [ "targets": [
{ {
"expr": "sum(\n container_memory_usage_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", "expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -5957,7 +4993,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(\n container_memory_usage_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", "expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -5975,7 +5011,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(\n container_memory_usage_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", "expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -6259,7 +5295,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(\n namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n", "expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{workload}} - {{workload_type}}", "legendFormat": "{{workload}} - {{workload_type}}",
@ -6480,7 +5516,7 @@ data:
"decimals": 2, "decimals": 2,
"link": true, "link": true,
"linkTooltip": "Drill down", "linkTooltip": "Drill down",
"linkUrl": "/d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$__cell_2", "linkUrl": "./d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$__cell_2",
"pattern": "workload", "pattern": "workload",
"thresholds": [ "thresholds": [
@ -6533,7 +5569,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(\n namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n", "expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -6551,7 +5587,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(\n namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n", "expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -6569,7 +5605,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(\n namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n", "expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -6669,7 +5705,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(\n container_memory_usage_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n", "expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{workload}} - {{workload_type}}", "legendFormat": "{{workload}} - {{workload_type}}",
@ -6890,7 +5926,7 @@ data:
"decimals": 2, "decimals": 2,
"link": true, "link": true,
"linkTooltip": "Drill down", "linkTooltip": "Drill down",
"linkUrl": "/d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$__cell_2", "linkUrl": "./d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$__cell_2",
"pattern": "workload", "pattern": "workload",
"thresholds": [ "thresholds": [
@ -6943,7 +5979,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(\n container_memory_usage_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n", "expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -6961,7 +5997,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(\n container_memory_usage_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n", "expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -6979,7 +6015,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(\n container_memory_usage_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n", "expr": "sum(\n container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", container!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}\n) by (workload, workload_type)\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -7151,3 +6187,7 @@ data:
"uid": "a87fb0d919ec0ea5f6543124e16c42a5", "uid": "a87fb0d919ec0ea5f6543124e16c42a5",
"version": 0 "version": 0
} }
kind: ConfigMap
metadata:
name: grafana-dashboards-k8s-resources
namespace: monitoring

View File

@ -1,8 +1,4 @@
apiVersion: v1 apiVersion: v1
kind: ConfigMap
metadata:
name: grafana-dashboards-k8s
namespace: monitoring
data: data:
apiserver.json: |- apiserver.json: |-
{ {
@ -1240,7 +1236,7 @@ data:
"query": "label_values(apiserver_request_total{job=\"apiserver\"}, instance)", "query": "label_values(apiserver_request_total{job=\"apiserver\"}, instance)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 1,
"tagValuesQuery": "", "tagValuesQuery": "",
"tags": [ "tags": [
@ -2351,7 +2347,7 @@ data:
"query": "label_values(process_cpu_seconds_total{job=\"kube-controller-manager\"}, instance)", "query": "label_values(process_cpu_seconds_total{job=\"kube-controller-manager\"}, instance)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 1,
"tagValuesQuery": "", "tagValuesQuery": "",
"tags": [ "tags": [
@ -2850,7 +2846,7 @@ data:
"query": "label_values(kubelet_volume_stats_capacity_bytes, cluster)", "query": "label_values(kubelet_volume_stats_capacity_bytes, cluster)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 1,
"tagValuesQuery": "", "tagValuesQuery": "",
"tags": [ "tags": [
@ -2876,7 +2872,7 @@ data:
"query": "label_values(kubelet_volume_stats_capacity_bytes{cluster=\"$cluster\", job=\"kubelet\"}, namespace)", "query": "label_values(kubelet_volume_stats_capacity_bytes{cluster=\"$cluster\", job=\"kubelet\"}, namespace)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 1,
"tagValuesQuery": "", "tagValuesQuery": "",
"tags": [ "tags": [
@ -2902,7 +2898,7 @@ data:
"query": "label_values(kubelet_volume_stats_capacity_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\"}, persistentvolumeclaim)", "query": "label_values(kubelet_volume_stats_capacity_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\"}, persistentvolumeclaim)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 1,
"tagValuesQuery": "", "tagValuesQuery": "",
"tags": [ "tags": [
@ -3052,7 +3048,7 @@ data:
"refId": "C" "refId": "C"
}, },
{ {
"expr": "sum by(container) (container_memory_cache{job=\"kubernetes-cadvisor\", namespace=\"$namespace\", pod=~\"$pod\", container=~\"$container\", container!=\"POD\"})", "expr": "sum by(container) (container_memory_cache{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$pod\", container=~\"$container\", container!=\"POD\"})",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "Cache: {{ container }}", "legendFormat": "Cache: {{ container }}",
@ -3482,7 +3478,7 @@ data:
"query": "label_values(kube_pod_info, cluster)", "query": "label_values(kube_pod_info, cluster)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 1,
"tagValuesQuery": "", "tagValuesQuery": "",
"tags": [ "tags": [
@ -3508,7 +3504,7 @@ data:
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)", "query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 1,
"tagValuesQuery": "", "tagValuesQuery": "",
"tags": [ "tags": [
@ -3534,7 +3530,7 @@ data:
"query": "label_values(kube_pod_info{cluster=\"$cluster\", namespace=~\"$namespace\"}, pod)", "query": "label_values(kube_pod_info{cluster=\"$cluster\", namespace=~\"$namespace\"}, pod)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 1,
"tagValuesQuery": "", "tagValuesQuery": "",
"tags": [ "tags": [
@ -3560,7 +3556,7 @@ data:
"query": "label_values(kube_pod_container_info{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}, container)", "query": "label_values(kube_pod_container_info{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}, container)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 1,
"tagValuesQuery": "", "tagValuesQuery": "",
"tags": [ "tags": [
@ -4596,7 +4592,7 @@ data:
"query": "label_values(process_cpu_seconds_total{job=\"kube-scheduler\"}, instance)", "query": "label_values(process_cpu_seconds_total{job=\"kube-scheduler\"}, instance)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 1,
"tagValuesQuery": "", "tagValuesQuery": "",
"tags": [ "tags": [
@ -5448,7 +5444,7 @@ data:
"query": "label_values(kube_statefulset_metadata_generation, cluster)", "query": "label_values(kube_statefulset_metadata_generation, cluster)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 1,
"tagValuesQuery": "", "tagValuesQuery": "",
"tags": [ "tags": [
@ -5471,10 +5467,10 @@ data:
"options": [ "options": [
], ],
"query": "label_values(kube_statefulset_metadata_generation{job=\"kube-state-metrics\"}, namespace)", "query": "label_values(kube_statefulset_metadata_generation{job=\"kube-state-metrics\", cluster=\"$cluster\"}, namespace)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 1,
"tagValuesQuery": "", "tagValuesQuery": "",
"tags": [ "tags": [
@ -5497,10 +5493,10 @@ data:
"options": [ "options": [
], ],
"query": "label_values(kube_statefulset_metadata_generation{job=\"kube-state-metrics\", namespace=\"$namespace\"}, statefulset)", "query": "label_values(kube_statefulset_metadata_generation{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\"}, statefulset)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 1,
"tagValuesQuery": "", "tagValuesQuery": "",
"tags": [ "tags": [
@ -5545,3 +5541,7 @@ data:
"uid": "a31c1f46e6f727cb37c0d731a7245005", "uid": "a31c1f46e6f727cb37c0d731a7245005",
"version": 0 "version": 0
} }
kind: ConfigMap
metadata:
name: grafana-dashboards-k8s
namespace: monitoring

View File

@ -1,8 +1,4 @@
apiVersion: v1 apiVersion: v1
kind: ConfigMap
metadata:
name: grafana-dashboards-nginx-ingress
namespace: monitoring
data: data:
nginx.json: |- nginx.json: |-
{ {
@ -26,7 +22,7 @@ data:
"links": [ "links": [
], ],
"refresh": "", "refresh": "10s",
"rows": [ "rows": [
{ {
"collapse": false, "collapse": false,
@ -94,7 +90,7 @@ data:
"tableColumn": "", "tableColumn": "",
"targets": [ "targets": [
{ {
"expr": "round(sum(irate(nginx_ingress_controller_requests{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",namespace=~\"$namespace\"}[2m])), 0.01)", "expr": "round(sum(irate(nginx_ingress_controller_requests{cluster=~\"$cluster\", controller_pod=~\"$controller\",controller_class=~\"$controller_class\", controller_namespace=~\"$namespace\"}[2m])), 0.01)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "", "legendFormat": "",
@ -176,7 +172,7 @@ data:
"tableColumn": "", "tableColumn": "",
"targets": [ "targets": [
{ {
"expr": "sum(avg_over_time(nginx_ingress_controller_nginx_process_connections{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\"}[2m]))", "expr": "sum(avg_over_time(nginx_ingress_controller_nginx_process_connections{cluster=~\"$cluster\", controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\"}[2m]))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "", "legendFormat": "",
@ -258,7 +254,7 @@ data:
"tableColumn": "", "tableColumn": "",
"targets": [ "targets": [
{ {
"expr": "sum(rate(nginx_ingress_controller_requests{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",namespace=~\"$namespace\",status!~\"[4-5].*\"}[2m])) / sum(rate(nginx_ingress_controller_requests{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",namespace=~\"$namespace\"}[2m]))", "expr": "sum(rate(nginx_ingress_controller_requests{cluster=~\"$cluster\", controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",status!~\"[4-5].*\"}[2m])) / sum(rate(nginx_ingress_controller_requests{cluster=~\"$cluster\", controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\"}[2m]))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "", "legendFormat": "",
@ -335,7 +331,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "round(sum(irate(nginx_ingress_controller_requests{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",ingress=~\"$ingress\"}[2m])) by (ingress), 0.01)", "expr": "round(sum(irate(nginx_ingress_controller_requests{cluster=~\"$cluster\", controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",ingress=~\"$ingress\"}[2m])) by (ingress), 0.01)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{ingress}}", "legendFormat": "{{ingress}}",
@ -426,7 +422,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(rate(nginx_ingress_controller_requests{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",namespace=~\"$namespace\",ingress=~\"$ingress\",status!~\"[4-5].*\"}[2m])) by (ingress) / sum(rate(nginx_ingress_controller_requests{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",namespace=~\"$namespace\",ingress=~\"$ingress\"}[2m])) by (ingress)", "expr": "sum(rate(nginx_ingress_controller_requests{cluster=~\"$cluster\", controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",ingress=~\"$ingress\",status!~\"[4-5].*\"}[2m])) by (ingress) / sum(rate(nginx_ingress_controller_requests{cluster=~\"$cluster\", controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",ingress=~\"$ingress\"}[2m])) by (ingress)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{ingress}}", "legendFormat": "{{ingress}}",
@ -530,21 +526,21 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "histogram_quantile(0.99, sum(rate(nginx_ingress_controller_request_duration_seconds_bucket{ingress!=\"\",controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",ingress=~\"$ingress\"}[2m])) by (le, ingress))", "expr": "histogram_quantile(0.99, sum(rate(nginx_ingress_controller_request_duration_seconds_bucket{cluster=~\"$cluster\", ingress!=\"\",controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",ingress=~\"$ingress\"}[2m])) by (le, ingress))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{ingress}} 99%", "legendFormat": "{{ingress}} 99%",
"refId": "A" "refId": "A"
}, },
{ {
"expr": "histogram_quantile(0.90, sum(rate(nginx_ingress_controller_request_duration_seconds_bucket{ingress!=\"\",controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",ingress=~\"$ingress\"}[2m])) by (le, ingress))", "expr": "histogram_quantile(0.90, sum(rate(nginx_ingress_controller_request_duration_seconds_bucket{cluster=~\"$cluster\", ingress!=\"\",controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",ingress=~\"$ingress\"}[2m])) by (le, ingress))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{ingress}} 90%", "legendFormat": "{{ingress}} 90%",
"refId": "B" "refId": "B"
}, },
{ {
"expr": "histogram_quantile(0.50, sum(rate(nginx_ingress_controller_request_duration_seconds_bucket{ingress!=\"\",controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",ingress=~\"$ingress\"}[2m])) by (le, ingress))", "expr": "histogram_quantile(0.50, sum(rate(nginx_ingress_controller_request_duration_seconds_bucket{cluster=~\"$cluster\", ingress!=\"\",controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",ingress=~\"$ingress\"}[2m])) by (le, ingress))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{ingress}} 50%", "legendFormat": "{{ingress}} 50%",
@ -648,14 +644,14 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum (irate (nginx_ingress_controller_request_size_sum{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\"}[2m]))", "expr": "sum (irate (nginx_ingress_controller_request_size_sum{cluster=~\"$cluster\", controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\"}[2m]))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "received", "legendFormat": "received",
"refId": "A" "refId": "A"
}, },
{ {
"expr": "sum (irate (nginx_ingress_controller_response_size_sum{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\"}[2m]))", "expr": "sum (irate (nginx_ingress_controller_response_size_sum{cluster=~\"$cluster\", controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\"}[2m]))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "sent", "legendFormat": "sent",
@ -746,7 +742,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "avg(nginx_ingress_controller_nginx_process_resident_memory_bytes{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\"}) by (controller_pod)", "expr": "avg(nginx_ingress_controller_nginx_process_resident_memory_bytes{cluster=~\"$cluster\", controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\"}) by (controller_pod)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{controller_pod}}", "legendFormat": "{{controller_pod}}",
@ -837,7 +833,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(rate(nginx_ingress_controller_nginx_process_cpu_seconds_total{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\"}[2m])) by (controller_pod)", "expr": "sum(rate(nginx_ingress_controller_nginx_process_cpu_seconds_total{cluster=~\"$cluster\", controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\"}[2m])) by (controller_pod)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{controller_pod}}", "legendFormat": "{{controller_pod}}",
@ -921,6 +917,32 @@ data:
"allValue": ".*", "allValue": ".*",
"current": { "current": {
},
"datasource": "$datasource",
"hide": 2,
"includeAll": true,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [
],
"query": "label_values(kube_pod_info, cluster)",
"refresh": 2,
"regex": "",
"sort": 0,
"tagValuesQuery": "",
"tags": [
],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": ".*",
"current": {
}, },
"datasource": "$datasource", "datasource": "$datasource",
"hide": 0, "hide": 0,
@ -931,7 +953,7 @@ data:
"options": [ "options": [
], ],
"query": "label_values(nginx_ingress_controller_config_hash, controller_namespace)", "query": "label_values(nginx_ingress_controller_config_hash{cluster=~\"$cluster\"}, controller_namespace)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 0,
@ -957,7 +979,7 @@ data:
"options": [ "options": [
], ],
"query": "label_values(nginx_ingress_controller_config_hash{namespace=~\"$namespace\"}, controller_class)", "query": "label_values(nginx_ingress_controller_config_hash{cluster=~\"$cluster\", namespace=~\"$namespace\"}, controller_class)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 0,
@ -983,7 +1005,7 @@ data:
"options": [ "options": [
], ],
"query": "label_values(nginx_ingress_controller_config_hash{namespace=~\"$namespace\",controller_class=~\"$controller_class\"}, controller_pod)", "query": "label_values(nginx_ingress_controller_config_hash{cluster=~\"$cluster\", namespace=~\"$namespace\", controller_class=~\"$controller_class\"}, controller_pod)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 0,
@ -1009,7 +1031,7 @@ data:
"options": [ "options": [
], ],
"query": "label_values(nginx_ingress_controller_requests{namespace=~\"$namespace\",controller_class=~\"$controller_class\",controller=~\"$controller\"}, ingress)", "query": "label_values(nginx_ingress_controller_requests{cluster=~\"$cluster\", namespace=~\"$namespace\", controller_class=~\"$controller_class\", controller=~\"$controller\"}, ingress)",
"refresh": 2, "refresh": 2,
"regex": "", "regex": "",
"sort": 0, "sort": 0,
@ -1057,3 +1079,7 @@ data:
"uid": "f4af03eca476c08ecf2b5cf15fd60168", "uid": "f4af03eca476c08ecf2b5cf15fd60168",
"version": 0 "version": 0
} }
kind: ConfigMap
metadata:
name: grafana-dashboards-nginx-ingress
namespace: monitoring

View File

@ -1,8 +1,4 @@
apiVersion: v1 apiVersion: v1
kind: ConfigMap
metadata:
name: grafana-dashboards-prom
namespace: monitoring
data: data:
prometheus-remote-write.json: |- prometheus-remote-write.json: |-
{ {
@ -2158,3 +2154,7 @@ data:
"uid": "", "uid": "",
"version": 0 "version": 0
} }
kind: ConfigMap
metadata:
name: grafana-dashboards-prom
namespace: monitoring

View File

@ -23,7 +23,7 @@ spec:
spec: spec:
containers: containers:
- name: grafana - name: grafana
image: docker.io/grafana/grafana:6.3.5 image: docker.io/grafana/grafana:6.4.4
env: env:
- name: GF_PATHS_CONFIG - name: GF_PATHS_CONFIG
value: "/etc/grafana/custom.ini" value: "/etc/grafana/custom.ini"

View File

@ -22,7 +22,7 @@ spec:
spec: spec:
containers: containers:
- name: nginx-ingress-controller - name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.25.1 image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.26.1
args: args:
- /nginx-ingress-controller - /nginx-ingress-controller
- --ingress-class=public - --ingress-class=public
@ -65,6 +65,11 @@ spec:
periodSeconds: 10 periodSeconds: 10
successThreshold: 1 successThreshold: 1
timeoutSeconds: 5 timeoutSeconds: 5
lifecycle:
preStop:
exec:
command:
- /wait-shutdown
securityContext: securityContext:
capabilities: capabilities:
add: add:
@ -73,4 +78,4 @@ spec:
- ALL - ALL
runAsUser: 33 # www-data runAsUser: 33 # www-data
restartPolicy: Always restartPolicy: Always
terminationGracePeriodSeconds: 60 terminationGracePeriodSeconds: 300

View File

@ -22,7 +22,7 @@ spec:
spec: spec:
containers: containers:
- name: nginx-ingress-controller - name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.25.1 image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.26.1
args: args:
- /nginx-ingress-controller - /nginx-ingress-controller
- --ingress-class=public - --ingress-class=public
@ -65,6 +65,11 @@ spec:
periodSeconds: 10 periodSeconds: 10
successThreshold: 1 successThreshold: 1
timeoutSeconds: 5 timeoutSeconds: 5
lifecycle:
preStop:
exec:
command:
- /wait-shutdown
securityContext: securityContext:
capabilities: capabilities:
add: add:
@ -73,4 +78,4 @@ spec:
- ALL - ALL
runAsUser: 33 # www-data runAsUser: 33 # www-data
restartPolicy: Always restartPolicy: Always
terminationGracePeriodSeconds: 60 terminationGracePeriodSeconds: 300

View File

@ -22,7 +22,7 @@ spec:
spec: spec:
containers: containers:
- name: nginx-ingress-controller - name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.25.1 image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.26.1
args: args:
- /nginx-ingress-controller - /nginx-ingress-controller
- --ingress-class=public - --ingress-class=public
@ -62,6 +62,11 @@ spec:
successThreshold: 1 successThreshold: 1
failureThreshold: 3 failureThreshold: 3
timeoutSeconds: 5 timeoutSeconds: 5
lifecycle:
preStop:
exec:
command:
- /wait-shutdown
securityContext: securityContext:
capabilities: capabilities:
add: add:
@ -70,5 +75,5 @@ spec:
- ALL - ALL
runAsUser: 33 # www-data runAsUser: 33 # www-data
restartPolicy: Always restartPolicy: Always
terminationGracePeriodSeconds: 60 terminationGracePeriodSeconds: 300

View File

@ -22,7 +22,7 @@ spec:
spec: spec:
containers: containers:
- name: nginx-ingress-controller - name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.25.1 image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.26.1
args: args:
- /nginx-ingress-controller - /nginx-ingress-controller
- --ingress-class=public - --ingress-class=public
@ -65,6 +65,11 @@ spec:
periodSeconds: 10 periodSeconds: 10
successThreshold: 1 successThreshold: 1
timeoutSeconds: 5 timeoutSeconds: 5
lifecycle:
preStop:
exec:
command:
- /wait-shutdown
securityContext: securityContext:
capabilities: capabilities:
add: add:
@ -73,4 +78,4 @@ spec:
- ALL - ALL
runAsUser: 33 # www-data runAsUser: 33 # www-data
restartPolicy: Always restartPolicy: Always
terminationGracePeriodSeconds: 60 terminationGracePeriodSeconds: 300

View File

@ -22,7 +22,7 @@ spec:
spec: spec:
containers: containers:
- name: nginx-ingress-controller - name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.25.1 image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.26.1
args: args:
- /nginx-ingress-controller - /nginx-ingress-controller
- --ingress-class=public - --ingress-class=public
@ -65,6 +65,11 @@ spec:
periodSeconds: 10 periodSeconds: 10
successThreshold: 1 successThreshold: 1
timeoutSeconds: 5 timeoutSeconds: 5
lifecycle:
preStop:
exec:
command:
- /wait-shutdown
securityContext: securityContext:
capabilities: capabilities:
add: add:
@ -73,4 +78,4 @@ spec:
- ALL - ALL
runAsUser: 33 # www-data runAsUser: 33 # www-data
restartPolicy: Always restartPolicy: Always
terminationGracePeriodSeconds: 60 terminationGracePeriodSeconds: 300

View File

@ -115,7 +115,7 @@ data:
- role: node - role: node
scheme: http scheme: http
relabel_configs: relabel_configs:
- source_labels: [__meta_kubernetes_node_label_node_role_kubernetes_io_controller] - source_labels: [__meta_kubernetes_node_label_node_kubernetes_io_controller]
action: keep action: keep
regex: 'true' regex: 'true'
- action: labelmap - action: labelmap

View File

@ -20,7 +20,7 @@ spec:
serviceAccountName: prometheus serviceAccountName: prometheus
containers: containers:
- name: prometheus - name: prometheus
image: quay.io/prometheus/prometheus:v2.12.0 image: quay.io/prometheus/prometheus:v2.14.0
args: args:
- --web.listen-address=0.0.0.0:9090 - --web.listen-address=0.0.0.0:9090
- --config.file=/etc/prometheus/prometheus.yaml - --config.file=/etc/prometheus/prometheus.yaml

View File

@ -24,7 +24,7 @@ spec:
serviceAccountName: kube-state-metrics serviceAccountName: kube-state-metrics
containers: containers:
- name: kube-state-metrics - name: kube-state-metrics
image: quay.io/coreos/kube-state-metrics:v1.7.2 image: quay.io/coreos/kube-state-metrics:v1.8.0
ports: ports:
- name: metrics - name: metrics
containerPort: 8080 containerPort: 8080
@ -37,33 +37,6 @@ spec:
readinessProbe: readinessProbe:
httpGet: httpGet:
path: / path: /
port: 8080 port: 8081
initialDelaySeconds: 5 initialDelaySeconds: 5
timeoutSeconds: 5 timeoutSeconds: 5
- name: addon-resizer
image: k8s.gcr.io/addon-resizer:1.8.5
resources:
limits:
cpu: 100m
memory: 30Mi
requests:
cpu: 100m
memory: 30Mi
env:
- name: MY_POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: MY_POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
command:
- /pod_nanny
- --container=kube-state-metrics
- --cpu=100m
- --extra-cpu=1m
- --memory=100Mi
- --extra-memory=2Mi
- --threshold=5
- --deployment=kube-state-metrics

View File

@ -1,13 +0,0 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: kube-state-metrics
namespace: monitoring
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: kube-state-metrics
subjects:
- kind: ServiceAccount
name: kube-state-metrics
namespace: monitoring

View File

@ -1,31 +0,0 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: kube-state-metrics
namespace: monitoring
rules:
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- apiGroups:
- extensions
resources:
- deployments
resourceNames:
- kube-state-metrics
verbs:
- get
- update
- apiGroups:
- apps
resources:
- deployments
resourceNames:
- kube-state-metrics
verbs:
- get
- update

View File

@ -1,8 +1,4 @@
apiVersion: v1 apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus-rules
namespace: monitoring
data: data:
etcd.yaml: |- etcd.yaml: |-
{ {
@ -10,6 +6,17 @@ data:
{ {
"name": "etcd", "name": "etcd",
"rules": [ "rules": [
{
"alert": "etcdMembersDown",
"annotations": {
"message": "etcd cluster \"{{ $labels.job }}\": members are down ({{ $value }})."
},
"expr": "max by (job) (\n sum by (job) (up{job=~\".*etcd.*\"} == bool 0)\nor\n count by (job,endpoint) (\n sum by (job,endpoint,To) (rate(etcd_network_peer_sent_failures_total{job=~\".*etcd.*\"}[3m])) > 0.01\n )\n)\n> 0\n",
"for": "3m",
"labels": {
"severity": "critical"
}
},
{ {
"alert": "etcdInsufficientMembers", "alert": "etcdInsufficientMembers",
"annotations": { "annotations": {
@ -135,30 +142,35 @@ data:
} }
] ]
} }
extra.yaml: |-
{
"groups": [
{
"name": "extra.rules",
"rules": [
{
"alert": "InactiveRAIDDisk",
"annotations": {
"message": "{{ $value }} RAID disk(s) on node {{ $labels.instance }} are inactive."
},
"expr": "node_md_disks - node_md_disks_active > 0",
"for": "10m",
"labels": {
"severity": "warning"
}
}
]
}
]
}
kube.yaml: |- kube.yaml: |-
{ {
"groups": [ "groups": [
{
"name": "kube-apiserver.rules",
"rules": [
{
"expr": "histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\"}[5m])) without(instance, pod))\n",
"labels": {
"quantile": "0.99"
},
"record": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile"
},
{
"expr": "histogram_quantile(0.9, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\"}[5m])) without(instance, pod))\n",
"labels": {
"quantile": "0.9"
},
"record": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile"
},
{
"expr": "histogram_quantile(0.5, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\"}[5m])) without(instance, pod))\n",
"labels": {
"quantile": "0.5"
},
"record": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile"
}
]
},
{ {
"name": "k8s.rules", "name": "k8s.rules",
"rules": [ "rules": [
@ -167,27 +179,35 @@ data:
"record": "namespace:container_cpu_usage_seconds_total:sum_rate" "record": "namespace:container_cpu_usage_seconds_total:sum_rate"
}, },
{ {
"expr": "sum by (namespace, pod, container) (\n rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", image!=\"\", container!=\"POD\"}[5m])\n)\n", "expr": "sum by (namespace, pod, container) (\n rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", image!=\"\", container!=\"POD\"}[5m])\n) * on (namespace, pod) group_left(node) max by(namespace, pod, node) (kube_pod_info)\n",
"record": "namespace_pod_container:container_cpu_usage_seconds_total:sum_rate" "record": "node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate"
},
{
"expr": "container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) max by(namespace, pod, node) (kube_pod_info)\n",
"record": "node_namespace_pod_container:container_memory_working_set_bytes"
},
{
"expr": "container_memory_rss{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) max by(namespace, pod, node) (kube_pod_info)\n",
"record": "node_namespace_pod_container:container_memory_rss"
},
{
"expr": "container_memory_cache{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) max by(namespace, pod, node) (kube_pod_info)\n",
"record": "node_namespace_pod_container:container_memory_cache"
},
{
"expr": "container_memory_swap{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) max by(namespace, pod, node) (kube_pod_info)\n",
"record": "node_namespace_pod_container:container_memory_swap"
}, },
{ {
"expr": "sum(container_memory_usage_bytes{job=\"kubernetes-cadvisor\", image!=\"\", container!=\"POD\"}) by (namespace)\n", "expr": "sum(container_memory_usage_bytes{job=\"kubernetes-cadvisor\", image!=\"\", container!=\"POD\"}) by (namespace)\n",
"record": "namespace:container_memory_usage_bytes:sum" "record": "namespace:container_memory_usage_bytes:sum"
}, },
{ {
"expr": "sum by (namespace, label_name) (\n sum(rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", image!=\"\", container!=\"POD\"}[5m])) by (namespace, pod)\n * on (namespace, pod)\n group_left(label_name) kube_pod_labels{job=\"kube-state-metrics\"}\n)\n", "expr": "sum by (namespace, label_name) (\n sum(kube_pod_container_resource_requests_memory_bytes{job=\"kube-state-metrics\"} * on (endpoint, instance, job, namespace, pod, service) group_left(phase) (kube_pod_status_phase{phase=~\"Pending|Running\"} == 1)) by (namespace, pod)\n * on (namespace, pod)\n group_left(label_name) kube_pod_labels{job=\"kube-state-metrics\"}\n)\n",
"record": "namespace:container_cpu_usage_seconds_total:sum_rate"
},
{
"expr": "sum by (namespace, label_name) (\n sum(container_memory_usage_bytes{job=\"kubernetes-cadvisor\",image!=\"\", container!=\"POD\"}) by (pod, namespace)\n * on (namespace, pod)\n group_left(label_name) kube_pod_labels{job=\"kube-state-metrics\"}\n)\n",
"record": "namespace:container_memory_usage_bytes:sum"
},
{
"expr": "sum by (namespace, label_name) (\n sum(kube_pod_container_resource_requests_memory_bytes{job=\"kube-state-metrics\"} * on (endpoint, instance, job, namespace, pod, service) group_left(phase) (kube_pod_status_phase{phase=~\"^(Pending|Running)$\"} == 1)) by (namespace, pod)\n * on (namespace, pod)\n group_left(label_name) kube_pod_labels{job=\"kube-state-metrics\"}\n)\n",
"record": "namespace:kube_pod_container_resource_requests_memory_bytes:sum" "record": "namespace:kube_pod_container_resource_requests_memory_bytes:sum"
}, },
{ {
"expr": "sum by (namespace, label_name) (\n sum(kube_pod_container_resource_requests_cpu_cores{job=\"kube-state-metrics\"} * on (endpoint, instance, job, namespace, pod, service) group_left(phase) (kube_pod_status_phase{phase=~\"^(Pending|Running)$\"} == 1)) by (namespace, pod)\n * on (namespace, pod)\n group_left(label_name) kube_pod_labels{job=\"kube-state-metrics\"}\n)\n", "expr": "sum by (namespace, label_name) (\n sum(kube_pod_container_resource_requests_cpu_cores{job=\"kube-state-metrics\"} * on (endpoint, instance, job, namespace, pod, service) group_left(phase) (kube_pod_status_phase{phase=~\"Pending|Running\"} == 1)) by (namespace, pod)\n * on (namespace, pod)\n group_left(label_name) kube_pod_labels{job=\"kube-state-metrics\"}\n)\n",
"record": "namespace:kube_pod_container_resource_requests_cpu_cores:sum" "record": "namespace:kube_pod_container_resource_requests_cpu_cores:sum"
}, },
{ {
@ -281,32 +301,6 @@ data:
} }
] ]
}, },
{
"name": "kube-apiserver.rules",
"rules": [
{
"expr": "histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\"}[5m])) without(instance, pod))\n",
"labels": {
"quantile": "0.99"
},
"record": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile"
},
{
"expr": "histogram_quantile(0.9, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\"}[5m])) without(instance, pod))\n",
"labels": {
"quantile": "0.9"
},
"record": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile"
},
{
"expr": "histogram_quantile(0.5, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\"}[5m])) without(instance, pod))\n",
"labels": {
"quantile": "0.5"
},
"record": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile"
}
]
},
{ {
"name": "node.rules", "name": "node.rules",
"rules": [ "rules": [
@ -323,169 +317,8 @@ data:
"record": "node:node_num_cpu:sum" "record": "node:node_num_cpu:sum"
}, },
{ {
"expr": "1 - avg(rate(node_cpu_seconds_total{job=\"node-exporter\",mode=\"idle\"}[1m]))\n", "expr": "sum(\n node_memory_MemAvailable_bytes{job=\"node-exporter\"} or\n (\n node_memory_Buffers_bytes{job=\"node-exporter\"} +\n node_memory_Cached_bytes{job=\"node-exporter\"} +\n node_memory_MemFree_bytes{job=\"node-exporter\"} +\n node_memory_Slab_bytes{job=\"node-exporter\"}\n )\n)\n",
"record": ":node_cpu_utilisation:avg1m" "record": ":node_memory_MemAvailable_bytes:sum"
},
{
"expr": "1 - avg by (node) (\n rate(node_cpu_seconds_total{job=\"node-exporter\",mode=\"idle\"}[1m])\n* on (namespace, pod) group_left(node)\n node_namespace_pod:kube_pod_info:)\n",
"record": "node:node_cpu_utilisation:avg1m"
},
{
"expr": "node:node_cpu_utilisation:avg1m\n *\nnode:node_num_cpu:sum\n /\nscalar(sum(node:node_num_cpu:sum))\n",
"record": "node:cluster_cpu_utilisation:ratio"
},
{
"expr": "sum(node_load1{job=\"node-exporter\"})\n/\nsum(node:node_num_cpu:sum)\n",
"record": ":node_cpu_saturation_load1:"
},
{
"expr": "sum by (node) (\n node_load1{job=\"node-exporter\"}\n* on (namespace, pod) group_left(node)\n node_namespace_pod:kube_pod_info:\n)\n/\nnode:node_num_cpu:sum\n",
"record": "node:node_cpu_saturation_load1:"
},
{
"expr": "1 -\nsum(node_memory_MemFree_bytes{job=\"node-exporter\"} + node_memory_Cached_bytes{job=\"node-exporter\"} + node_memory_Buffers_bytes{job=\"node-exporter\"})\n/\nsum(node_memory_MemTotal_bytes{job=\"node-exporter\"})\n",
"record": ":node_memory_utilisation:"
},
{
"expr": "sum(node_memory_MemFree_bytes{job=\"node-exporter\"} + node_memory_Cached_bytes{job=\"node-exporter\"} + node_memory_Buffers_bytes{job=\"node-exporter\"})\n",
"record": ":node_memory_MemFreeCachedBuffers_bytes:sum"
},
{
"expr": "sum(node_memory_MemTotal_bytes{job=\"node-exporter\"})\n",
"record": ":node_memory_MemTotal_bytes:sum"
},
{
"expr": "sum by (node) (\n (node_memory_MemFree_bytes{job=\"node-exporter\"} + node_memory_Cached_bytes{job=\"node-exporter\"} + node_memory_Buffers_bytes{job=\"node-exporter\"})\n * on (namespace, pod) group_left(node)\n node_namespace_pod:kube_pod_info:\n)\n",
"record": "node:node_memory_bytes_available:sum"
},
{
"expr": "sum by (node) (\n node_memory_MemTotal_bytes{job=\"node-exporter\"}\n * on (namespace, pod) group_left(node)\n node_namespace_pod:kube_pod_info:\n)\n",
"record": "node:node_memory_bytes_total:sum"
},
{
"expr": "(node:node_memory_bytes_total:sum - node:node_memory_bytes_available:sum)\n/\nnode:node_memory_bytes_total:sum\n",
"record": "node:node_memory_utilisation:ratio"
},
{
"expr": "(node:node_memory_bytes_total:sum - node:node_memory_bytes_available:sum)\n/\nscalar(sum(node:node_memory_bytes_total:sum))\n",
"record": "node:cluster_memory_utilisation:ratio"
},
{
"expr": "1e3 * sum(\n (rate(node_vmstat_pgpgin{job=\"node-exporter\"}[1m])\n + rate(node_vmstat_pgpgout{job=\"node-exporter\"}[1m]))\n)\n",
"record": ":node_memory_swap_io_bytes:sum_rate"
},
{
"expr": "1 -\nsum by (node) (\n (node_memory_MemFree_bytes{job=\"node-exporter\"} + node_memory_Cached_bytes{job=\"node-exporter\"} + node_memory_Buffers_bytes{job=\"node-exporter\"})\n* on (namespace, pod) group_left(node)\n node_namespace_pod:kube_pod_info:\n)\n/\nsum by (node) (\n node_memory_MemTotal_bytes{job=\"node-exporter\"}\n* on (namespace, pod) group_left(node)\n node_namespace_pod:kube_pod_info:\n)\n",
"record": "node:node_memory_utilisation:"
},
{
"expr": "1 - (node:node_memory_bytes_available:sum / node:node_memory_bytes_total:sum)\n",
"record": "node:node_memory_utilisation_2:"
},
{
"expr": "1e3 * sum by (node) (\n (rate(node_vmstat_pgpgin{job=\"node-exporter\"}[1m])\n + rate(node_vmstat_pgpgout{job=\"node-exporter\"}[1m]))\n * on (namespace, pod) group_left(node)\n node_namespace_pod:kube_pod_info:\n)\n",
"record": "node:node_memory_swap_io_bytes:sum_rate"
},
{
"expr": "avg(irate(node_disk_io_time_seconds_total{job=\"node-exporter\",device=~\"nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+\"}[1m]))\n",
"record": ":node_disk_utilisation:avg_irate"
},
{
"expr": "avg by (node) (\n irate(node_disk_io_time_seconds_total{job=\"node-exporter\",device=~\"nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+\"}[1m])\n* on (namespace, pod) group_left(node)\n node_namespace_pod:kube_pod_info:\n)\n",
"record": "node:node_disk_utilisation:avg_irate"
},
{
"expr": "avg(irate(node_disk_io_time_weighted_seconds_total{job=\"node-exporter\",device=~\"nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+\"}[1m]))\n",
"record": ":node_disk_saturation:avg_irate"
},
{
"expr": "avg by (node) (\n irate(node_disk_io_time_weighted_seconds_total{job=\"node-exporter\",device=~\"nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+\"}[1m])\n* on (namespace, pod) group_left(node)\n node_namespace_pod:kube_pod_info:\n)\n",
"record": "node:node_disk_saturation:avg_irate"
},
{
"expr": "max by (instance, namespace, pod, device) ((node_filesystem_size_bytes{fstype=~\"ext[234]|btrfs|xfs|zfs\"}\n- node_filesystem_avail_bytes{fstype=~\"ext[234]|btrfs|xfs|zfs\"})\n/ node_filesystem_size_bytes{fstype=~\"ext[234]|btrfs|xfs|zfs\"})\n",
"record": "node:node_filesystem_usage:"
},
{
"expr": "max by (instance, namespace, pod, device) (node_filesystem_avail_bytes{fstype=~\"ext[234]|btrfs|xfs|zfs\"} / node_filesystem_size_bytes{fstype=~\"ext[234]|btrfs|xfs|zfs\"})\n",
"record": "node:node_filesystem_avail:"
},
{
"expr": "sum(irate(node_network_receive_bytes_total{job=\"node-exporter\",device!~\"veth.+\"}[1m])) +\nsum(irate(node_network_transmit_bytes_total{job=\"node-exporter\",device!~\"veth.+\"}[1m]))\n",
"record": ":node_net_utilisation:sum_irate"
},
{
"expr": "sum by (node) (\n (irate(node_network_receive_bytes_total{job=\"node-exporter\",device!~\"veth.+\"}[1m]) +\n irate(node_network_transmit_bytes_total{job=\"node-exporter\",device!~\"veth.+\"}[1m]))\n* on (namespace, pod) group_left(node)\n node_namespace_pod:kube_pod_info:\n)\n",
"record": "node:node_net_utilisation:sum_irate"
},
{
"expr": "sum(irate(node_network_receive_drop_total{job=\"node-exporter\",device!~\"veth.+\"}[1m])) +\nsum(irate(node_network_transmit_drop_total{job=\"node-exporter\",device!~\"veth.+\"}[1m]))\n",
"record": ":node_net_saturation:sum_irate"
},
{
"expr": "sum by (node) (\n (irate(node_network_receive_drop_total{job=\"node-exporter\",device!~\"veth.+\"}[1m]) +\n irate(node_network_transmit_drop_total{job=\"node-exporter\",device!~\"veth.+\"}[1m]))\n* on (namespace, pod) group_left(node)\n node_namespace_pod:kube_pod_info:\n)\n",
"record": "node:node_net_saturation:sum_irate"
},
{
"expr": "max(\n max(\n kube_pod_info{job=\"kube-state-metrics\", host_ip!=\"\"}\n ) by (node, host_ip)\n * on (host_ip) group_right (node)\n label_replace(\n (max(node_filesystem_files{job=\"node-exporter\", mountpoint=\"/\"}) by (instance)), \"host_ip\", \"$1\", \"instance\", \"(.*):.*\"\n )\n) by (node)\n",
"record": "node:node_inodes_total:"
},
{
"expr": "max(\n max(\n kube_pod_info{job=\"kube-state-metrics\", host_ip!=\"\"}\n ) by (node, host_ip)\n * on (host_ip) group_right (node)\n label_replace(\n (max(node_filesystem_files_free{job=\"node-exporter\", mountpoint=\"/\"}) by (instance)), \"host_ip\", \"$1\", \"instance\", \"(.*):.*\"\n )\n) by (node)\n",
"record": "node:node_inodes_free:"
}
]
},
{
"name": "kubernetes-absent",
"rules": [
{
"alert": "KubeAPIDown",
"annotations": {
"message": "KubeAPI has disappeared from Prometheus target discovery.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapidown"
},
"expr": "absent(up{job=\"apiserver\"} == 1)\n",
"for": "15m",
"labels": {
"severity": "critical"
}
},
{
"alert": "KubeControllerManagerDown",
"annotations": {
"message": "KubeControllerManager has disappeared from Prometheus target discovery.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontrollermanagerdown"
},
"expr": "absent(up{job=\"kube-controller-manager\"} == 1)\n",
"for": "15m",
"labels": {
"severity": "critical"
}
},
{
"alert": "KubeSchedulerDown",
"annotations": {
"message": "KubeScheduler has disappeared from Prometheus target discovery.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeschedulerdown"
},
"expr": "absent(up{job=\"kube-scheduler\"} == 1)\n",
"for": "15m",
"labels": {
"severity": "critical"
}
},
{
"alert": "KubeletDown",
"annotations": {
"message": "Kubelet has disappeared from Prometheus target discovery.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletdown"
},
"expr": "absent(up{job=\"kubelet\"} == 1)\n",
"for": "15m",
"labels": {
"severity": "critical"
}
} }
] ]
}, },
@ -499,7 +332,7 @@ data:
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodcrashlooping" "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodcrashlooping"
}, },
"expr": "rate(kube_pod_container_status_restarts_total{job=\"kube-state-metrics\"}[15m]) * 60 * 5 > 0\n", "expr": "rate(kube_pod_container_status_restarts_total{job=\"kube-state-metrics\"}[15m]) * 60 * 5 > 0\n",
"for": "1h", "for": "15m",
"labels": { "labels": {
"severity": "critical" "severity": "critical"
} }
@ -507,11 +340,11 @@ data:
{ {
"alert": "KubePodNotReady", "alert": "KubePodNotReady",
"annotations": { "annotations": {
"message": "Pod {{ $labels.namespace }}/{{ $labels.pod }} has been in a non-ready state for longer than an hour.", "message": "Pod {{ $labels.namespace }}/{{ $labels.pod }} has been in a non-ready state for longer than 15 minutes.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodnotready" "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodnotready"
}, },
"expr": "sum by (namespace, pod) (kube_pod_status_phase{job=\"kube-state-metrics\", phase=~\"Failed|Pending|Unknown\"}) > 0\n", "expr": "sum by (namespace, pod) (kube_pod_status_phase{job=\"kube-state-metrics\", phase=~\"Failed|Pending|Unknown\"} * on(namespace, pod) group_left(owner_kind) kube_pod_owner{owner_kind!=\"Job\"}) > 0\n",
"for": "1h", "for": "15m",
"labels": { "labels": {
"severity": "critical" "severity": "critical"
} }
@ -531,11 +364,11 @@ data:
{ {
"alert": "KubeDeploymentReplicasMismatch", "alert": "KubeDeploymentReplicasMismatch",
"annotations": { "annotations": {
"message": "Deployment {{ $labels.namespace }}/{{ $labels.deployment }} has not matched the expected number of replicas for longer than an hour.", "message": "Deployment {{ $labels.namespace }}/{{ $labels.deployment }} has not matched the expected number of replicas for longer than 15 minutes.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedeploymentreplicasmismatch" "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedeploymentreplicasmismatch"
}, },
"expr": "kube_deployment_spec_replicas{job=\"kube-state-metrics\"}\n !=\nkube_deployment_status_replicas_available{job=\"kube-state-metrics\"}\n", "expr": "kube_deployment_spec_replicas{job=\"kube-state-metrics\"}\n !=\nkube_deployment_status_replicas_available{job=\"kube-state-metrics\"}\n",
"for": "1h", "for": "15m",
"labels": { "labels": {
"severity": "critical" "severity": "critical"
} }
@ -579,15 +412,27 @@ data:
{ {
"alert": "KubeDaemonSetRolloutStuck", "alert": "KubeDaemonSetRolloutStuck",
"annotations": { "annotations": {
"message": "Only {{ $value }}% of the desired Pods of DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset }} are scheduled and ready.", "message": "Only {{ $value | humanizePercentage }} of the desired Pods of DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset }} are scheduled and ready.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedaemonsetrolloutstuck" "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedaemonsetrolloutstuck"
}, },
"expr": "kube_daemonset_status_number_ready{job=\"kube-state-metrics\"}\n /\nkube_daemonset_status_desired_number_scheduled{job=\"kube-state-metrics\"} * 100 < 100\n", "expr": "kube_daemonset_status_number_ready{job=\"kube-state-metrics\"}\n /\nkube_daemonset_status_desired_number_scheduled{job=\"kube-state-metrics\"} < 1.00\n",
"for": "15m", "for": "15m",
"labels": { "labels": {
"severity": "critical" "severity": "critical"
} }
}, },
{
"alert": "KubeContainerWaiting",
"annotations": {
"message": "Pod {{ $labels.namespace }}/{{ $labels.pod }} container {{ $labels.container}} has been in waiting state for longer than 1 hour.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontainerwaiting"
},
"expr": "sum by (namespace, pod, container) (kube_pod_container_status_waiting_reason{job=\"kube-state-metrics\"}) > 0\n",
"for": "1h",
"labels": {
"severity": "warning"
}
},
{ {
"alert": "KubeDaemonSetNotScheduled", "alert": "KubeDaemonSetNotScheduled",
"annotations": { "annotations": {
@ -642,8 +487,32 @@ data:
"message": "Job {{ $labels.namespace }}/{{ $labels.job_name }} failed to complete.", "message": "Job {{ $labels.namespace }}/{{ $labels.job_name }} failed to complete.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubejobfailed" "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubejobfailed"
}, },
"expr": "kube_job_status_failed{job=\"kube-state-metrics\"} > 0\n", "expr": "kube_job_failed{job=\"kube-state-metrics\"} > 0\n",
"for": "1h", "for": "15m",
"labels": {
"severity": "warning"
}
},
{
"alert": "KubeHpaReplicasMismatch",
"annotations": {
"message": "HPA {{ $labels.namespace }}/{{ $labels.hpa }} has not matched the desired number of replicas for longer than 15 minutes.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubehpareplicasmismatch"
},
"expr": "(kube_hpa_status_desired_replicas{job=\"kube-state-metrics\"}\n !=\nkube_hpa_status_current_replicas{job=\"kube-state-metrics\"})\n and\nchanges(kube_hpa_status_current_replicas[15m]) == 0\n",
"for": "15m",
"labels": {
"severity": "warning"
}
},
{
"alert": "KubeHpaMaxedOut",
"annotations": {
"message": "HPA {{ $labels.namespace }}/{{ $labels.hpa }} has been running at max replicas for longer than 15 minutes.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubehpamaxedout"
},
"expr": "kube_hpa_status_current_replicas{job=\"kube-state-metrics\"}\n ==\nkube_hpa_spec_max_replicas{job=\"kube-state-metrics\"}\n",
"for": "15m",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
} }
@ -704,10 +573,10 @@ data:
{ {
"alert": "KubeQuotaExceeded", "alert": "KubeQuotaExceeded",
"annotations": { "annotations": {
"message": "Namespace {{ $labels.namespace }} is using {{ printf \"%0.0f\" $value }}% of its {{ $labels.resource }} quota.", "message": "Namespace {{ $labels.namespace }} is using {{ $value | humanizePercentage }} of its {{ $labels.resource }} quota.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubequotaexceeded" "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubequotaexceeded"
}, },
"expr": "100 * kube_resourcequota{job=\"kube-state-metrics\", type=\"used\"}\n / ignoring(instance, job, type)\n(kube_resourcequota{job=\"kube-state-metrics\", type=\"hard\"} > 0)\n > 90\n", "expr": "kube_resourcequota{job=\"kube-state-metrics\", type=\"used\"}\n / ignoring(instance, job, type)\n(kube_resourcequota{job=\"kube-state-metrics\", type=\"hard\"} > 0)\n > 0.90\n",
"for": "15m", "for": "15m",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
@ -716,10 +585,10 @@ data:
{ {
"alert": "CPUThrottlingHigh", "alert": "CPUThrottlingHigh",
"annotations": { "annotations": {
"message": "{{ printf \"%0.0f\" $value }}% throttling of CPU in namespace {{ $labels.namespace }} for container {{ $labels.container }} in pod {{ $labels.pod }}.", "message": "{{ $value | humanizePercentage }} throttling of CPU in namespace {{ $labels.namespace }} for container {{ $labels.container }} in pod {{ $labels.pod }}.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-cputhrottlinghigh" "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-cputhrottlinghigh"
}, },
"expr": "100 * sum(increase(container_cpu_cfs_throttled_periods_total{container!=\"\", }[5m])) by (container, pod, namespace)\n /\nsum(increase(container_cpu_cfs_periods_total{}[5m])) by (container, pod, namespace)\n > 100 \n", "expr": "sum(increase(container_cpu_cfs_throttled_periods_total{container!=\"\", }[5m])) by (container, pod, namespace)\n /\nsum(increase(container_cpu_cfs_periods_total{}[5m])) by (container, pod, namespace)\n > ( 100 / 100 )\n",
"for": "15m", "for": "15m",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
@ -733,10 +602,10 @@ data:
{ {
"alert": "KubePersistentVolumeUsageCritical", "alert": "KubePersistentVolumeUsageCritical",
"annotations": { "annotations": {
"message": "The PersistentVolume claimed by {{ $labels.persistentvolumeclaim }} in Namespace {{ $labels.namespace }} is only {{ printf \"%0.2f\" $value }}% free.", "message": "The PersistentVolume claimed by {{ $labels.persistentvolumeclaim }} in Namespace {{ $labels.namespace }} is only {{ $value | humanizePercentage }} free.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepersistentvolumeusagecritical" "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepersistentvolumeusagecritical"
}, },
"expr": "100 * kubelet_volume_stats_available_bytes{job=\"kubelet\"}\n /\nkubelet_volume_stats_capacity_bytes{job=\"kubelet\"}\n < 3\n", "expr": "kubelet_volume_stats_available_bytes{job=\"kubelet\"}\n /\nkubelet_volume_stats_capacity_bytes{job=\"kubelet\"}\n < 0.03\n",
"for": "1m", "for": "1m",
"labels": { "labels": {
"severity": "critical" "severity": "critical"
@ -745,10 +614,10 @@ data:
{ {
"alert": "KubePersistentVolumeFullInFourDays", "alert": "KubePersistentVolumeFullInFourDays",
"annotations": { "annotations": {
"message": "Based on recent sampling, the PersistentVolume claimed by {{ $labels.persistentvolumeclaim }} in Namespace {{ $labels.namespace }} is expected to fill up within four days. Currently {{ printf \"%0.2f\" $value }}% is available.", "message": "Based on recent sampling, the PersistentVolume claimed by {{ $labels.persistentvolumeclaim }} in Namespace {{ $labels.namespace }} is expected to fill up within four days. Currently {{ $value | humanizePercentage }} is available.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepersistentvolumefullinfourdays" "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepersistentvolumefullinfourdays"
}, },
"expr": "100 * (\n kubelet_volume_stats_available_bytes{job=\"kubelet\"}\n /\n kubelet_volume_stats_capacity_bytes{job=\"kubelet\"}\n) < 15\nand\npredict_linear(kubelet_volume_stats_available_bytes{job=\"kubelet\"}[6h], 4 * 24 * 3600) < 0\n", "expr": "(\n kubelet_volume_stats_available_bytes{job=\"kubelet\"}\n /\n kubelet_volume_stats_capacity_bytes{job=\"kubelet\"}\n) < 0.15\nand\npredict_linear(kubelet_volume_stats_available_bytes{job=\"kubelet\"}[6h], 4 * 24 * 3600) < 0\n",
"for": "5m", "for": "5m",
"labels": { "labels": {
"severity": "critical" "severity": "critical"
@ -771,37 +640,13 @@ data:
{ {
"name": "kubernetes-system", "name": "kubernetes-system",
"rules": [ "rules": [
{
"alert": "KubeNodeNotReady",
"annotations": {
"message": "{{ $labels.node }} has been unready for more than an hour.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubenodenotready"
},
"expr": "kube_node_status_condition{job=\"kube-state-metrics\",condition=\"Ready\",status=\"true\"} == 0\n",
"for": "1h",
"labels": {
"severity": "warning"
}
},
{ {
"alert": "KubeVersionMismatch", "alert": "KubeVersionMismatch",
"annotations": { "annotations": {
"message": "There are {{ $value }} different semantic versions of Kubernetes components running.", "message": "There are {{ $value }} different semantic versions of Kubernetes components running.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeversionmismatch" "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeversionmismatch"
}, },
"expr": "count(count by (gitVersion) (label_replace(kubernetes_build_info{job!=\"coredns\"},\"gitVersion\",\"$1\",\"gitVersion\",\"(v[0-9]*.[0-9]*.[0-9]*).*\"))) > 1\n", "expr": "count(count by (gitVersion) (label_replace(kubernetes_build_info{job!~\"kube-dns|coredns\"},\"gitVersion\",\"$1\",\"gitVersion\",\"(v[0-9]*.[0-9]*.[0-9]*).*\"))) > 1\n",
"for": "1h",
"labels": {
"severity": "warning"
}
},
{
"alert": "KubeClientErrors",
"annotations": {
"message": "Kubernetes API server client '{{ $labels.job }}/{{ $labels.instance }}' is experiencing {{ printf \"%0.0f\" $value }}% errors.'",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclienterrors"
},
"expr": "(sum(rate(rest_client_requests_total{code=~\"5..\"}[5m])) by (instance, job)\n /\nsum(rate(rest_client_requests_total[5m])) by (instance, job))\n* 100 > 1\n",
"for": "15m", "for": "15m",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
@ -810,34 +655,27 @@ data:
{ {
"alert": "KubeClientErrors", "alert": "KubeClientErrors",
"annotations": { "annotations": {
"message": "Kubernetes API server client '{{ $labels.job }}/{{ $labels.instance }}' is experiencing {{ printf \"%0.0f\" $value }} errors / second.", "message": "Kubernetes API server client '{{ $labels.job }}/{{ $labels.instance }}' is experiencing {{ $value | humanizePercentage }} errors.'",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclienterrors" "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclienterrors"
}, },
"expr": "sum(rate(ksm_scrape_error_total{job=\"kube-state-metrics\"}[5m])) by (instance, job) > 0.1\n", "expr": "(sum(rate(rest_client_requests_total{code=~\"5..\"}[5m])) by (instance, job)\n /\nsum(rate(rest_client_requests_total[5m])) by (instance, job))\n> 0.01\n",
"for": "15m", "for": "15m",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
} }
}, }
{ ]
"alert": "KubeletTooManyPods", },
"annotations": { {
"message": "Kubelet {{ $labels.instance }} is running {{ $value }} Pods, close to the limit of 110.", "name": "kubernetes-system-apiserver",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubelettoomanypods" "rules": [
},
"expr": "kubelet_running_pod_count{job=\"kubelet\"} > 110 * 0.9\n",
"for": "15m",
"labels": {
"severity": "warning"
}
},
{ {
"alert": "KubeAPILatencyHigh", "alert": "KubeAPILatencyHigh",
"annotations": { "annotations": {
"message": "The API server has a 99th percentile latency of {{ $value }} seconds for {{ $labels.verb }} {{ $labels.resource }}.", "message": "The API server has a 99th percentile latency of {{ $value }} seconds for {{ $labels.verb }} {{ $labels.resource }}.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapilatencyhigh" "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapilatencyhigh"
}, },
"expr": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile{job=\"apiserver\",quantile=\"0.99\",subresource!=\"log\",verb!~\"^(?:LIST|WATCH|WATCHLIST|PROXY|CONNECT)$\"} > 1\n", "expr": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile{job=\"apiserver\",quantile=\"0.99\",subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|PROXY|CONNECT\"} > 1\n",
"for": "10m", "for": "10m",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
@ -849,7 +687,7 @@ data:
"message": "The API server has a 99th percentile latency of {{ $value }} seconds for {{ $labels.verb }} {{ $labels.resource }}.", "message": "The API server has a 99th percentile latency of {{ $value }} seconds for {{ $labels.verb }} {{ $labels.resource }}.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapilatencyhigh" "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapilatencyhigh"
}, },
"expr": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile{job=\"apiserver\",quantile=\"0.99\",subresource!=\"log\",verb!~\"^(?:LIST|WATCH|WATCHLIST|PROXY|CONNECT)$\"} > 4\n", "expr": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile{job=\"apiserver\",quantile=\"0.99\",subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|PROXY|CONNECT\"} > 4\n",
"for": "10m", "for": "10m",
"labels": { "labels": {
"severity": "critical" "severity": "critical"
@ -858,10 +696,10 @@ data:
{ {
"alert": "KubeAPIErrorsHigh", "alert": "KubeAPIErrorsHigh",
"annotations": { "annotations": {
"message": "API server is returning errors for {{ $value }}% of requests.", "message": "API server is returning errors for {{ $value | humanizePercentage }} of requests.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapierrorshigh" "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapierrorshigh"
}, },
"expr": "sum(rate(apiserver_request_total{job=\"apiserver\",code=~\"^(?:5..)$\"}[5m]))\n /\nsum(rate(apiserver_request_total{job=\"apiserver\"}[5m])) * 100 > 3\n", "expr": "sum(rate(apiserver_request_total{job=\"apiserver\",code=~\"5..\"}[5m]))\n /\nsum(rate(apiserver_request_total{job=\"apiserver\"}[5m])) > 0.03\n",
"for": "10m", "for": "10m",
"labels": { "labels": {
"severity": "critical" "severity": "critical"
@ -870,10 +708,10 @@ data:
{ {
"alert": "KubeAPIErrorsHigh", "alert": "KubeAPIErrorsHigh",
"annotations": { "annotations": {
"message": "API server is returning errors for {{ $value }}% of requests.", "message": "API server is returning errors for {{ $value | humanizePercentage }} of requests.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapierrorshigh" "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapierrorshigh"
}, },
"expr": "sum(rate(apiserver_request_total{job=\"apiserver\",code=~\"^(?:5..)$\"}[5m]))\n /\nsum(rate(apiserver_request_total{job=\"apiserver\"}[5m])) * 100 > 1\n", "expr": "sum(rate(apiserver_request_total{job=\"apiserver\",code=~\"5..\"}[5m]))\n /\nsum(rate(apiserver_request_total{job=\"apiserver\"}[5m])) > 0.01\n",
"for": "10m", "for": "10m",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
@ -882,10 +720,10 @@ data:
{ {
"alert": "KubeAPIErrorsHigh", "alert": "KubeAPIErrorsHigh",
"annotations": { "annotations": {
"message": "API server is returning errors for {{ $value }}% of requests for {{ $labels.verb }} {{ $labels.resource }} {{ $labels.subresource }}.", "message": "API server is returning errors for {{ $value | humanizePercentage }} of requests for {{ $labels.verb }} {{ $labels.resource }} {{ $labels.subresource }}.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapierrorshigh" "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapierrorshigh"
}, },
"expr": "sum(rate(apiserver_request_total{job=\"apiserver\",code=~\"^(?:5..)$\"}[5m])) by (resource,subresource,verb)\n /\nsum(rate(apiserver_request_total{job=\"apiserver\"}[5m])) by (resource,subresource,verb) * 100 > 10\n", "expr": "sum(rate(apiserver_request_total{job=\"apiserver\",code=~\"5..\"}[5m])) by (resource,subresource,verb)\n /\nsum(rate(apiserver_request_total{job=\"apiserver\"}[5m])) by (resource,subresource,verb) > 0.10\n",
"for": "10m", "for": "10m",
"labels": { "labels": {
"severity": "critical" "severity": "critical"
@ -894,10 +732,10 @@ data:
{ {
"alert": "KubeAPIErrorsHigh", "alert": "KubeAPIErrorsHigh",
"annotations": { "annotations": {
"message": "API server is returning errors for {{ $value }}% of requests for {{ $labels.verb }} {{ $labels.resource }} {{ $labels.subresource }}.", "message": "API server is returning errors for {{ $value | humanizePercentage }} of requests for {{ $labels.verb }} {{ $labels.resource }} {{ $labels.subresource }}.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapierrorshigh" "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapierrorshigh"
}, },
"expr": "sum(rate(apiserver_request_total{job=\"apiserver\",code=~\"^(?:5..)$\"}[5m])) by (resource,subresource,verb)\n /\nsum(rate(apiserver_request_total{job=\"apiserver\"}[5m])) by (resource,subresource,verb) * 100 > 5\n", "expr": "sum(rate(apiserver_request_total{job=\"apiserver\",code=~\"5..\"}[5m])) by (resource,subresource,verb)\n /\nsum(rate(apiserver_request_total{job=\"apiserver\"}[5m])) by (resource,subresource,verb) > 0.05\n",
"for": "10m", "for": "10m",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
@ -924,68 +762,15 @@ data:
"labels": { "labels": {
"severity": "critical" "severity": "critical"
} }
}
]
}
]
}
kubeprom.yaml: |-
{
"groups": [
{
"name": "kube-prometheus-node-recording.rules",
"rules": [
{
"expr": "sum(rate(node_cpu_seconds_total{mode!=\"idle\",mode!=\"iowait\"}[3m])) BY (instance)",
"record": "instance:node_cpu:rate:sum"
}, },
{ {
"expr": "sum((node_filesystem_size_bytes{mountpoint=\"/\"} - node_filesystem_free_bytes{mountpoint=\"/\"})) BY (instance)", "alert": "KubeAPIDown",
"record": "instance:node_filesystem_usage:sum"
},
{
"expr": "sum(rate(node_network_receive_bytes_total[3m])) BY (instance)",
"record": "instance:node_network_receive_bytes:rate:sum"
},
{
"expr": "sum(rate(node_network_transmit_bytes_total[3m])) BY (instance)",
"record": "instance:node_network_transmit_bytes:rate:sum"
},
{
"expr": "sum(rate(node_cpu_seconds_total{mode!=\"idle\",mode!=\"iowait\"}[5m])) WITHOUT (cpu, mode) / ON(instance) GROUP_LEFT() count(sum(node_cpu_seconds_total) BY (instance, cpu)) BY (instance)",
"record": "instance:node_cpu:ratio"
},
{
"expr": "sum(rate(node_cpu_seconds_total{mode!=\"idle\",mode!=\"iowait\"}[5m]))",
"record": "cluster:node_cpu:sum_rate5m"
},
{
"expr": "cluster:node_cpu_seconds_total:rate5m / count(sum(node_cpu_seconds_total) BY (instance, cpu))",
"record": "cluster:node_cpu:ratio"
}
]
},
{
"name": "kube-prometheus-node-alerting.rules",
"rules": [
{
"alert": "NodeDiskRunningFull",
"annotations": { "annotations": {
"message": "Device {{ $labels.device }} of node-exporter {{ $labels.namespace }}/{{ $labels.pod }} will be full within the next 24 hours." "message": "KubeAPI has disappeared from Prometheus target discovery.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapidown"
}, },
"expr": "(node:node_filesystem_usage: > 0.85) and (predict_linear(node:node_filesystem_avail:[6h], 3600 * 24) < 0)\n", "expr": "absent(up{job=\"apiserver\"} == 1)\n",
"for": "30m", "for": "15m",
"labels": {
"severity": "warning"
}
},
{
"alert": "NodeDiskRunningFull",
"annotations": {
"message": "Device {{ $labels.device }} of node-exporter {{ $labels.namespace }}/{{ $labels.pod }} will be full within the next 2 hours."
},
"expr": "(node:node_filesystem_usage: > 0.85) and (predict_linear(node:node_filesystem_avail:[30m], 3600 * 2) < 0)\n",
"for": "10m",
"labels": { "labels": {
"severity": "critical" "severity": "critical"
} }
@ -993,71 +778,87 @@ data:
] ]
}, },
{ {
"name": "node-time", "name": "kubernetes-system-kubelet",
"rules": [ "rules": [
{ {
"alert": "ClockSkewDetected", "alert": "KubeNodeNotReady",
"annotations": { "annotations": {
"message": "Clock skew detected on node-exporter {{ $labels.namespace }}/{{ $labels.pod }}. Ensure NTP is configured correctly on this host." "message": "{{ $labels.node }} has been unready for more than 15 minutes.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubenodenotready"
}, },
"expr": "abs(node_timex_offset_seconds{job=\"node-exporter\"}) > 0.03\n", "expr": "kube_node_status_condition{job=\"kube-state-metrics\",condition=\"Ready\",status=\"true\"} == 0\n",
"for": "2m", "for": "15m",
"labels": {
"severity": "warning"
}
}
]
},
{
"name": "node-network",
"rules": [
{
"alert": "NetworkReceiveErrors",
"annotations": {
"message": "Network interface \"{{ $labels.device }}\" showing receive errors on node-exporter {{ $labels.namespace }}/{{ $labels.pod }}\""
},
"expr": "rate(node_network_receive_errs_total{job=\"node-exporter\",device!~\"veth.+|tunl.+\"}[2m]) > 0\n",
"for": "2m",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
} }
}, },
{ {
"alert": "NetworkTransmitErrors", "alert": "KubeNodeUnreachable",
"annotations": { "annotations": {
"message": "Network interface \"{{ $labels.device }}\" showing transmit errors on node-exporter {{ $labels.namespace }}/{{ $labels.pod }}\"" "message": "{{ $labels.node }} is unreachable and some workloads may be rescheduled.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubenodeunreachable"
}, },
"expr": "rate(node_network_transmit_errs_total{job=\"node-exporter\",device!~\"veth.+|tunl.+\"}[2m]) > 0\n", "expr": "kube_node_spec_taint{job=\"kube-state-metrics\",key=\"node.kubernetes.io/unreachable\",effect=\"NoSchedule\"} == 1\n",
"for": "2m",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
} }
}, },
{ {
"alert": "NodeNetworkInterfaceFlapping", "alert": "KubeletTooManyPods",
"annotations": { "annotations": {
"message": "Network interface \"{{ $labels.device }}\" changing it's up status often on node-exporter {{ $labels.namespace }}/{{ $labels.pod }}\"" "message": "Kubelet '{{ $labels.node }}' is running at {{ $value | humanizePercentage }} of its Pod capacity.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubelettoomanypods"
}, },
"expr": "changes(node_network_up{job=\"node-exporter\",device!~\"veth.+|tunl.+\"}[2m]) > 2\n", "expr": "max(max(kubelet_running_pod_count{job=\"kubelet\"}) by(instance) * on(instance) group_left(node) kubelet_node_name{job=\"kubelet\"}) by(node) / max(kube_node_status_capacity_pods{job=\"kube-state-metrics\"}) by(node) > 0.95\n",
"for": "2m", "for": "15m",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
} }
},
{
"alert": "KubeletDown",
"annotations": {
"message": "Kubelet has disappeared from Prometheus target discovery.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletdown"
},
"expr": "absent(up{job=\"kubelet\"} == 1)\n",
"for": "15m",
"labels": {
"severity": "critical"
}
} }
] ]
}, },
{ {
"name": "general.rules", "name": "kubernetes-system-scheduler",
"rules": [ "rules": [
{ {
"alert": "TargetDown", "alert": "KubeSchedulerDown",
"annotations": { "annotations": {
"message": "{{ $value }}% of the {{ $labels.job }} targets are down." "message": "KubeScheduler has disappeared from Prometheus target discovery.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeschedulerdown"
}, },
"expr": "100 * (count(up == 0) BY (job) / count(up) BY (job)) > 10", "expr": "absent(up{job=\"kube-scheduler\"} == 1)\n",
"for": "10m", "for": "15m",
"labels": { "labels": {
"severity": "warning" "severity": "critical"
}
}
]
},
{
"name": "kubernetes-system-controller-manager",
"rules": [
{
"alert": "KubeControllerManagerDown",
"annotations": {
"message": "KubeControllerManager has disappeared from Prometheus target discovery.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontrollermanagerdown"
},
"expr": "absent(up{job=\"kube-controller-manager\"} == 1)\n",
"for": "15m",
"labels": {
"severity": "critical"
} }
} }
] ]
@ -1154,18 +955,6 @@ data:
"severity": "warning" "severity": "warning"
} }
}, },
{
"alert": "PrometheusTSDBWALCorruptions",
"annotations": {
"description": "Prometheus {{$labels.instance}} has detected {{$value | humanize}} corruptions of the write-ahead log (WAL) over the last 3h.",
"summary": "Prometheus is detecting WAL corruptions."
},
"expr": "increase(tsdb_wal_corruptions_total{job=\"prometheus\"}[3h]) > 0\n",
"for": "4h",
"labels": {
"severity": "warning"
}
},
{ {
"alert": "PrometheusNotIngestingSamples", "alert": "PrometheusNotIngestingSamples",
"annotations": { "annotations": {
@ -1181,7 +970,7 @@ data:
{ {
"alert": "PrometheusDuplicateTimestamps", "alert": "PrometheusDuplicateTimestamps",
"annotations": { "annotations": {
"description": "Prometheus {{$labels.instance}} is dropping {{$value | humanize}} samples/s with different values but duplicated timestamp.", "description": "Prometheus {{$labels.instance}} is dropping {{ printf \"%.4g\" $value }} samples/s with different values but duplicated timestamp.",
"summary": "Prometheus is dropping samples with duplicate timestamps." "summary": "Prometheus is dropping samples with duplicate timestamps."
}, },
"expr": "rate(prometheus_target_scrapes_sample_duplicate_timestamp_total{job=\"prometheus\"}[5m]) > 0\n", "expr": "rate(prometheus_target_scrapes_sample_duplicate_timestamp_total{job=\"prometheus\"}[5m]) > 0\n",
@ -1193,7 +982,7 @@ data:
{ {
"alert": "PrometheusOutOfOrderTimestamps", "alert": "PrometheusOutOfOrderTimestamps",
"annotations": { "annotations": {
"description": "Prometheus {{$labels.instance}} is dropping {{$value | humanize}} samples/s with timestamps arriving out of order.", "description": "Prometheus {{$labels.instance}} is dropping {{ printf \"%.4g\" $value }} samples/s with timestamps arriving out of order.",
"summary": "Prometheus drops samples with out-of-order timestamps." "summary": "Prometheus drops samples with out-of-order timestamps."
}, },
"expr": "rate(prometheus_target_scrapes_sample_out_of_order_total{job=\"prometheus\"}[5m]) > 0\n", "expr": "rate(prometheus_target_scrapes_sample_out_of_order_total{job=\"prometheus\"}[5m]) > 0\n",
@ -1226,6 +1015,18 @@ data:
"severity": "critical" "severity": "critical"
} }
}, },
{
"alert": "PrometheusRemoteWriteDesiredShards",
"annotations": {
"description": "Prometheus {{$labels.instance}} remote write desired shards calculation wants to run {{ printf $value }} shards, which is more than the max of {{ printf `prometheus_remote_storage_shards_max{instance=\"%s\",job=\"prometheus\"}` $labels.instance | query | first | value }}.",
"summary": "Prometheus remote write desired shards calculation wants to run more than configured max shards."
},
"expr": "# Without max_over_time, failed scrapes could create false negatives, see\n# https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details.\n(\n max_over_time(prometheus_remote_storage_shards_desired{job=\"prometheus\"}[5m])\n> on(job, instance) group_right\n max_over_time(prometheus_remote_storage_shards_max{job=\"prometheus\"}[5m])\n)\n",
"for": "15m",
"labels": {
"severity": "warning"
}
},
{ {
"alert": "PrometheusRuleFailures", "alert": "PrometheusRuleFailures",
"annotations": { "annotations": {
@ -1254,3 +1055,44 @@ data:
} }
] ]
} }
typhoon.yaml: |-
{
"groups": [
{
"name": "general.rules",
"rules": [
{
"alert": "TargetDown",
"annotations": {
"message": "{{ printf \"%.4g\" $value }}% of the {{ $labels.job }} targets are down."
},
"expr": "100 * (count(up == 0) BY (job, namespace, service) / count(up) BY (job, namespace, service)) > 10",
"for": "10m",
"labels": {
"severity": "warning"
}
}
]
},
{
"name": "extra.rules",
"rules": [
{
"alert": "InactiveRAIDDisk",
"annotations": {
"message": "{{ $value }} RAID disk(s) on node {{ $labels.instance }} are inactive."
},
"expr": "node_md_disks - node_md_disks_active > 0",
"for": "10m",
"labels": {
"severity": "warning"
}
}
]
}
]
}
kind: ConfigMap
metadata:
name: prometheus-rules
namespace: monitoring

View File

@ -5,6 +5,7 @@ metadata:
namespace: monitoring namespace: monitoring
annotations: annotations:
prometheus.io/scrape: 'true' prometheus.io/scrape: 'true'
prometheus.io/port: '9090'
spec: spec:
type: ClusterIP type: ClusterIP
selector: selector:

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a> ## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.16.0 (upstream) * Kubernetes v1.16.3 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking * Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) * On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/cl/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization * Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/cl/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests) # Kubernetes assets (kubeconfig, manifests)
module "bootstrap" { module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=539b725093c8cd94ba46603adb25ac5280562ec8" source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=0daa1276c633fea28e41b2c2c18831e2584deb24"
cluster_name = var.cluster_name cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)] api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]

View File

@ -7,7 +7,7 @@ systemd:
- name: 40-etcd-cluster.conf - name: 40-etcd-cluster.conf
contents: | contents: |
[Service] [Service]
Environment="ETCD_IMAGE_TAG=v3.4.0" Environment="ETCD_IMAGE_TAG=v3.4.3"
Environment="ETCD_NAME=${etcd_name}" Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379" Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380" Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
@ -113,7 +113,7 @@ systemd:
--volume script,kind=host,source=/opt/bootstrap/apply \ --volume script,kind=host,source=/opt/bootstrap/apply \
--mount volume=script,target=/apply \ --mount volume=script,target=/apply \
--insecure-options=image \ --insecure-options=image \
docker://k8s.gcr.io/hyperkube:v1.16.0 \ docker://k8s.gcr.io/hyperkube:v1.16.3 \
--net=host \ --net=host \
--dns=host \ --dns=host \
--exec=/apply --exec=/apply
@ -134,7 +134,7 @@ storage:
contents: contents:
inline: | inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.16.0 KUBELET_IMAGE_TAG=v1.16.3
- path: /opt/bootstrap/apply - path: /opt/bootstrap/apply
filesystem: root filesystem: root
mode: 0544 mode: 0544

View File

@ -18,57 +18,57 @@ variable "dns_zone_id" {
# instances # instances
variable "controller_count" { variable "controller_count" {
type = string type = number
default = "1"
description = "Number of controllers (i.e. masters)" description = "Number of controllers (i.e. masters)"
default = 1
} }
variable "worker_count" { variable "worker_count" {
type = string type = number
default = "1"
description = "Number of workers" description = "Number of workers"
default = 1
} }
variable "controller_type" { variable "controller_type" {
type = string type = string
default = "t3.small"
description = "EC2 instance type for controllers" description = "EC2 instance type for controllers"
default = "t3.small"
} }
variable "worker_type" { variable "worker_type" {
type = string type = string
default = "t3.small"
description = "EC2 instance type for workers" description = "EC2 instance type for workers"
default = "t3.small"
} }
variable "os_image" { variable "os_image" {
type = string type = string
default = "coreos-stable"
description = "AMI channel for a Container Linux derivative (coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge)" description = "AMI channel for a Container Linux derivative (coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge)"
default = "coreos-stable"
} }
variable "disk_size" { variable "disk_size" {
type = string type = number
default = "40"
description = "Size of the EBS volume in GB" description = "Size of the EBS volume in GB"
default = 40
} }
variable "disk_type" { variable "disk_type" {
type = string type = string
default = "gp2"
description = "Type of the EBS volume (e.g. standard, gp2, io1)" description = "Type of the EBS volume (e.g. standard, gp2, io1)"
default = "gp2"
} }
variable "disk_iops" { variable "disk_iops" {
type = string type = number
default = "0"
description = "IOPS of the EBS volume (e.g. 100)" description = "IOPS of the EBS volume (e.g. 100)"
default = 0
} }
variable "worker_price" { variable "worker_price" {
type = string type = number
default = "" description = "Spot price in USD for worker instances or 0 to use on-demand instances"
description = "Spot price in USD for autoscaling group spot instances. Leave as default empty string for autoscaling group to use on-demand instances. Note, switching in-place from spot to on-demand is not possible: https://github.com/terraform-providers/terraform-provider-aws/issues/4320" default = 0
} }
variable "worker_target_groups" { variable "worker_target_groups" {
@ -97,60 +97,66 @@ variable "ssh_authorized_key" {
} }
variable "asset_dir" { variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
type = string type = string
description = "Absolute path to a directory where generated assets should be placed (contains secrets)"
} }
variable "networking" { variable "networking" {
description = "Choice of networking provider (calico or flannel)"
type = string type = string
description = "Choice of networking provider (calico or flannel)"
default = "calico" default = "calico"
} }
variable "network_mtu" { variable "network_mtu" {
type = number
description = "CNI interface MTU (applies to calico only). Use 8981 if using instances types with Jumbo frames." description = "CNI interface MTU (applies to calico only). Use 8981 if using instances types with Jumbo frames."
type = string default = 1480
default = "1480"
} }
variable "host_cidr" { variable "host_cidr" {
description = "CIDR IPv4 range to assign to EC2 nodes"
type = string type = string
description = "CIDR IPv4 range to assign to EC2 nodes"
default = "10.0.0.0/16" default = "10.0.0.0/16"
} }
variable "pod_cidr" { variable "pod_cidr" {
description = "CIDR IPv4 range to assign Kubernetes pods"
type = string type = string
description = "CIDR IPv4 range to assign Kubernetes pods"
default = "10.2.0.0/16" default = "10.2.0.0/16"
} }
variable "service_cidr" { variable "service_cidr" {
type = string
description = <<EOD description = <<EOD
CIDR IPv4 range to assign Kubernetes services. CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns. The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD EOD
type = string
default = "10.3.0.0/16" default = "10.3.0.0/16"
} }
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = string
default = "cluster.local"
}
variable "enable_reporting" { variable "enable_reporting" {
type = string type = bool
description = "Enable usage or analytics reporting to upstreams (Calico)" description = "Enable usage or analytics reporting to upstreams (Calico)"
default = "false" default = false
} }
variable "enable_aggregation" { variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer (defaults to false)" description = "Enable the Kubernetes Aggregation Layer (defaults to false)"
type = string default = false
default = "false" }
variable "worker_node_labels" {
type = list(string)
description = "List of initial worker node labels"
default = []
}
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" {
type = string
description = "Queries for domains with the suffix will be answered by CoreDNS. Default is cluster.local (e.g. foo.default.svc.cluster.local)"
default = "cluster.local"
} }

View File

@ -19,5 +19,6 @@ module "workers" {
service_cidr = var.service_cidr service_cidr = var.service_cidr
cluster_domain_suffix = var.cluster_domain_suffix cluster_domain_suffix = var.cluster_domain_suffix
clc_snippets = var.worker_clc_snippets clc_snippets = var.worker_clc_snippets
node_labels = var.worker_node_labels
} }

View File

@ -61,6 +61,9 @@ systemd:
--lock-file=/var/run/lock/kubelet.lock \ --lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \ --network-plugin=cni \
--node-labels=node.kubernetes.io/node \ --node-labels=node.kubernetes.io/node \
%{ for label in split(",", node_labels) }
--node-labels=${label} \
%{ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \ --pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \ --read-only-port=0 \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins --volume-plugin-dir=/var/lib/kubelet/volumeplugins
@ -95,7 +98,7 @@ storage:
contents: contents:
inline: | inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.16.0 KUBELET_IMAGE_TAG=v1.16.3
- path: /etc/sysctl.d/max-user-watches.conf - path: /etc/sysctl.d/max-user-watches.conf
filesystem: root filesystem: root
contents: contents:
@ -113,7 +116,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \ --volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \ --mount volume=config,target=/etc/kubernetes \
--insecure-options=image \ --insecure-options=image \
docker://k8s.gcr.io/hyperkube:v1.16.0 \ docker://k8s.gcr.io/hyperkube:v1.16.3 \
--net=host \ --net=host \
--dns=host \ --dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname) --exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)

View File

@ -23,45 +23,45 @@ variable "security_groups" {
# instances # instances
variable "worker_count" { variable "worker_count" {
type = string type = number
default = "1"
description = "Number of instances" description = "Number of instances"
default = 1
} }
variable "instance_type" { variable "instance_type" {
type = string type = string
default = "t3.small"
description = "EC2 instance type" description = "EC2 instance type"
default = "t3.small"
} }
variable "os_image" { variable "os_image" {
type = string type = string
default = "coreos-stable"
description = "AMI channel for a Container Linux derivative (coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge)" description = "AMI channel for a Container Linux derivative (coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge)"
default = "coreos-stable"
} }
variable "disk_size" { variable "disk_size" {
type = string type = number
default = "40"
description = "Size of the EBS volume in GB" description = "Size of the EBS volume in GB"
default = 40
} }
variable "disk_type" { variable "disk_type" {
type = string type = string
default = "gp2"
description = "Type of the EBS volume (e.g. standard, gp2, io1)" description = "Type of the EBS volume (e.g. standard, gp2, io1)"
default = "gp2"
} }
variable "disk_iops" { variable "disk_iops" {
type = string type = number
default = "0"
description = "IOPS of the EBS volume (required for io1)" description = "IOPS of the EBS volume (required for io1)"
default = 0
} }
variable "spot_price" { variable "spot_price" {
type = string type = number
default = "" description = "Spot price in USD for worker instances or 0 to use on-demand instances"
description = "Spot price in USD for autoscaling group spot instances. Leave as default empty string for autoscaling group to use on-demand instances. Note, switching in-place from spot to on-demand is not possible: https://github.com/terraform-providers/terraform-provider-aws/issues/4320" default = 0
} }
variable "target_groups" { variable "target_groups" {
@ -89,19 +89,22 @@ variable "ssh_authorized_key" {
} }
variable "service_cidr" { variable "service_cidr" {
type = string
description = <<EOD description = <<EOD
CIDR IPv4 range to assign Kubernetes services. CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns. The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD EOD
type = string
default = "10.3.0.0/16" default = "10.3.0.0/16"
} }
variable "cluster_domain_suffix" { variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = string type = string
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
default = "cluster.local" default = "cluster.local"
} }
variable "node_labels" {
type = list(string)
description = "List of initial node labels"
default = []
}

View File

@ -46,7 +46,7 @@ resource "aws_autoscaling_group" "workers" {
resource "aws_launch_configuration" "worker" { resource "aws_launch_configuration" "worker" {
image_id = local.ami_id image_id = local.ami_id
instance_type = var.instance_type instance_type = var.instance_type
spot_price = var.spot_price spot_price = var.spot_price > 0 ? var.spot_price : null
enable_monitoring = false enable_monitoring = false
user_data = data.ct_config.worker-ignition.rendered user_data = data.ct_config.worker-ignition.rendered
@ -86,6 +86,7 @@ data "template_file" "worker-config" {
cluster_dns_service_ip = cidrhost(var.service_cidr, 10) cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix cluster_domain_suffix = var.cluster_domain_suffix
cgroup_driver = local.flavor == "flatcar" && local.channel == "edge" ? "systemd" : "cgroupfs" cgroup_driver = local.flavor == "flatcar" && local.channel == "edge" ? "systemd" : "cgroupfs"
node_labels = join(",", var.node_labels)
} }
} }

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a> ## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.16.0 (upstream) * Kubernetes v1.16.3 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking * Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) * On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/cl/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization * Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/cl/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization

View File

@ -13,9 +13,11 @@ data "aws_ami" "fedora-coreos" {
values = ["hvm"] values = ["hvm"]
} }
// pin on known ok versions as preview matures
filter { filter {
name = "name" name = "name"
values = ["fedora-coreos-30.20190801.0-hvm"] values = ["fedora-coreos-30.*.*-hvm"]
} }
# try to filter out dev images (AWS filters can't)
name_regex = "^fedora-coreos-30.[0-9]*.[0-9]*-hvm*"
} }

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests) # Kubernetes assets (kubeconfig, manifests)
module "bootstrap" { module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=539b725093c8cd94ba46603adb25ac5280562ec8" source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=0daa1276c633fea28e41b2c2c18831e2584deb24"
cluster_name = var.cluster_name cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)] api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]

View File

@ -28,7 +28,7 @@ systemd:
--network host \ --network host \
--volume /var/lib/etcd:/var/lib/etcd:rw,Z \ --volume /var/lib/etcd:/var/lib/etcd:rw,Z \
--volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \ --volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \
quay.io/coreos/etcd:v3.4.0 quay.io/coreos/etcd:v3.4.3
ExecStop=/usr/bin/podman stop etcd ExecStop=/usr/bin/podman stop etcd
[Install] [Install]
WantedBy=multi-user.target WantedBy=multi-user.target
@ -80,7 +80,7 @@ systemd:
--volume /var/run:/var/run \ --volume /var/run:/var/run \
--volume /var/run/lock:/var/run/lock:z \ --volume /var/run/lock:/var/run/lock:z \
--volume /opt/cni/bin:/opt/cni/bin:z \ --volume /opt/cni/bin:/opt/cni/bin:z \
k8s.gcr.io/hyperkube:v1.16.0 /hyperkube kubelet \ k8s.gcr.io/hyperkube:v1.16.3 /hyperkube kubelet \
--anonymous-auth=false \ --anonymous-auth=false \
--authentication-token-webhook \ --authentication-token-webhook \
--authorization-mode=Webhook \ --authorization-mode=Webhook \
@ -121,7 +121,7 @@ systemd:
--network host \ --network host \
--volume /opt/bootstrap/assets:/assets:ro,Z \ --volume /opt/bootstrap/assets:/assets:ro,Z \
--volume /opt/bootstrap/apply:/apply:ro,Z \ --volume /opt/bootstrap/apply:/apply:ro,Z \
k8s.gcr.io/hyperkube:v1.16.0 \ k8s.gcr.io/hyperkube:v1.16.3 \
/apply /apply
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
ExecStartPost=-/usr/bin/podman stop bootstrap ExecStartPost=-/usr/bin/podman stop bootstrap

View File

@ -18,57 +18,57 @@ variable "dns_zone_id" {
# instances # instances
variable "controller_count" { variable "controller_count" {
type = string type = number
default = "1"
description = "Number of controllers (i.e. masters)" description = "Number of controllers (i.e. masters)"
default = 1
} }
variable "worker_count" { variable "worker_count" {
type = string type = number
default = "1"
description = "Number of workers" description = "Number of workers"
default = 1
} }
variable "controller_type" { variable "controller_type" {
type = string type = string
default = "t3.small"
description = "EC2 instance type for controllers" description = "EC2 instance type for controllers"
default = "t3.small"
} }
variable "worker_type" { variable "worker_type" {
type = string type = string
default = "t3.small"
description = "EC2 instance type for workers" description = "EC2 instance type for workers"
default = "t3.small"
} }
variable "os_image" { variable "os_image" {
type = string type = string
default = "coreos-stable"
description = "AMI channel for Fedora CoreOS (not yet used)" description = "AMI channel for Fedora CoreOS (not yet used)"
default = "coreos-stable"
} }
variable "disk_size" { variable "disk_size" {
type = string type = number
default = "40"
description = "Size of the EBS volume in GB" description = "Size of the EBS volume in GB"
default = 40
} }
variable "disk_type" { variable "disk_type" {
type = string type = string
default = "gp2"
description = "Type of the EBS volume (e.g. standard, gp2, io1)" description = "Type of the EBS volume (e.g. standard, gp2, io1)"
default = "gp2"
} }
variable "disk_iops" { variable "disk_iops" {
type = string type = number
default = "0"
description = "IOPS of the EBS volume (e.g. 100)" description = "IOPS of the EBS volume (e.g. 100)"
default = 0
} }
variable "worker_price" { variable "worker_price" {
type = string type = number
default = "" description = "Spot price in USD for worker instances or 0 to use on-demand instances"
description = "Spot price in USD for autoscaling group spot instances. Leave as default empty string for autoscaling group to use on-demand instances. Note, switching in-place from spot to on-demand is not possible: https://github.com/terraform-providers/terraform-provider-aws/issues/4320" default = 0
} }
variable "worker_target_groups" { variable "worker_target_groups" {
@ -97,60 +97,66 @@ variable "ssh_authorized_key" {
} }
variable "asset_dir" { variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
type = string type = string
description = "Absolute path to a directory where generated assets should be placed (contains secrets)"
} }
variable "networking" { variable "networking" {
description = "Choice of networking provider (calico or flannel)"
type = string type = string
description = "Choice of networking provider (calico or flannel)"
default = "calico" default = "calico"
} }
variable "network_mtu" { variable "network_mtu" {
type = number
description = "CNI interface MTU (applies to calico only). Use 8981 if using instances types with Jumbo frames." description = "CNI interface MTU (applies to calico only). Use 8981 if using instances types with Jumbo frames."
type = string default = 1480
default = "1480"
} }
variable "host_cidr" { variable "host_cidr" {
description = "CIDR IPv4 range to assign to EC2 nodes"
type = string type = string
description = "CIDR IPv4 range to assign to EC2 nodes"
default = "10.0.0.0/16" default = "10.0.0.0/16"
} }
variable "pod_cidr" { variable "pod_cidr" {
description = "CIDR IPv4 range to assign Kubernetes pods"
type = string type = string
description = "CIDR IPv4 range to assign Kubernetes pods"
default = "10.2.0.0/16" default = "10.2.0.0/16"
} }
variable "service_cidr" { variable "service_cidr" {
type = string
description = <<EOD description = <<EOD
CIDR IPv4 range to assign Kubernetes services. CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns. The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD EOD
type = string
default = "10.3.0.0/16" default = "10.3.0.0/16"
} }
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = string
default = "cluster.local"
}
variable "enable_reporting" { variable "enable_reporting" {
type = string type = bool
description = "Enable usage or analytics reporting to upstreams (Calico)" description = "Enable usage or analytics reporting to upstreams (Calico)"
default = "false" default = false
} }
variable "enable_aggregation" { variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer (defaults to false)" description = "Enable the Kubernetes Aggregation Layer (defaults to false)"
type = string default = false
default = "false" }
variable "worker_node_labels" {
type = list(string)
description = "List of initial worker node labels"
default = []
}
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" {
type = string
description = "Queries for domains with the suffix will be answered by CoreDNS. Default is cluster.local (e.g. foo.default.svc.cluster.local)"
default = "cluster.local"
} }

View File

@ -19,5 +19,6 @@ module "workers" {
service_cidr = var.service_cidr service_cidr = var.service_cidr
cluster_domain_suffix = var.cluster_domain_suffix cluster_domain_suffix = var.cluster_domain_suffix
snippets = var.worker_snippets snippets = var.worker_snippets
node_labels = var.worker_node_labels
} }

View File

@ -13,9 +13,11 @@ data "aws_ami" "fedora-coreos" {
values = ["hvm"] values = ["hvm"]
} }
// pin on known ok versions as preview matures
filter { filter {
name = "name" name = "name"
values = ["fedora-coreos-30.20190801.0-hvm"] values = ["fedora-coreos-30.*.*-hvm"]
} }
# try to filter out dev images (AWS filters can't)
name_regex = "^fedora-coreos-30.[0-9]*.[0-9]*-hvm*"
} }

View File

@ -50,7 +50,7 @@ systemd:
--volume /var/run:/var/run \ --volume /var/run:/var/run \
--volume /var/run/lock:/var/run/lock:z \ --volume /var/run/lock:/var/run/lock:z \
--volume /opt/cni/bin:/opt/cni/bin:z \ --volume /opt/cni/bin:/opt/cni/bin:z \
k8s.gcr.io/hyperkube:v1.16.0 /hyperkube kubelet \ k8s.gcr.io/hyperkube:v1.16.3 /hyperkube kubelet \
--anonymous-auth=false \ --anonymous-auth=false \
--authentication-token-webhook \ --authentication-token-webhook \
--authorization-mode=Webhook \ --authorization-mode=Webhook \
@ -66,6 +66,9 @@ systemd:
--lock-file=/var/run/lock/kubelet.lock \ --lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \ --network-plugin=cni \
--node-labels=node.kubernetes.io/node \ --node-labels=node.kubernetes.io/node \
%{ for label in split(",", node_labels) }
--node-labels=${label} \
%{ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \ --pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \ --read-only-port=0 \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins --volume-plugin-dir=/var/lib/kubelet/volumeplugins

View File

@ -23,45 +23,45 @@ variable "security_groups" {
# instances # instances
variable "worker_count" { variable "worker_count" {
type = string type = number
default = "1"
description = "Number of instances" description = "Number of instances"
default = 1
} }
variable "instance_type" { variable "instance_type" {
type = string type = string
default = "t3.small"
description = "EC2 instance type" description = "EC2 instance type"
default = "t3.small"
} }
variable "os_image" { variable "os_image" {
type = string type = string
default = "coreos-stable"
description = "AMI channel for Fedora CoreOS (not yet used)" description = "AMI channel for Fedora CoreOS (not yet used)"
default = "coreos-stable"
} }
variable "disk_size" { variable "disk_size" {
type = string type = number
default = "40"
description = "Size of the EBS volume in GB" description = "Size of the EBS volume in GB"
default = 40
} }
variable "disk_type" { variable "disk_type" {
type = string type = string
default = "gp2"
description = "Type of the EBS volume (e.g. standard, gp2, io1)" description = "Type of the EBS volume (e.g. standard, gp2, io1)"
default = "gp2"
} }
variable "disk_iops" { variable "disk_iops" {
type = string type = number
default = "0"
description = "IOPS of the EBS volume (required for io1)" description = "IOPS of the EBS volume (required for io1)"
default = 0
} }
variable "spot_price" { variable "spot_price" {
type = string type = number
default = "" description = "Spot price in USD for worker instances or 0 to use on-demand instances"
description = "Spot price in USD for autoscaling group spot instances. Leave as default empty string for autoscaling group to use on-demand instances. Note, switching in-place from spot to on-demand is not possible: https://github.com/terraform-providers/terraform-provider-aws/issues/4320" default = 0
} }
variable "target_groups" { variable "target_groups" {
@ -89,19 +89,22 @@ variable "ssh_authorized_key" {
} }
variable "service_cidr" { variable "service_cidr" {
type = string
description = <<EOD description = <<EOD
CIDR IPv4 range to assign Kubernetes services. CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns. The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD EOD
type = string
default = "10.3.0.0/16" default = "10.3.0.0/16"
} }
variable "cluster_domain_suffix" { variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = string type = string
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
default = "cluster.local" default = "cluster.local"
} }
variable "node_labels" {
type = list(string)
description = "List of initial node labels"
default = []
}

View File

@ -46,7 +46,7 @@ resource "aws_autoscaling_group" "workers" {
resource "aws_launch_configuration" "worker" { resource "aws_launch_configuration" "worker" {
image_id = data.aws_ami.fedora-coreos.image_id image_id = data.aws_ami.fedora-coreos.image_id
instance_type = var.instance_type instance_type = var.instance_type
spot_price = var.spot_price spot_price = var.spot_price > 0 ? var.spot_price : null
enable_monitoring = false enable_monitoring = false
user_data = data.ct_config.worker-ignition.rendered user_data = data.ct_config.worker-ignition.rendered
@ -85,6 +85,7 @@ data "template_file" "worker-config" {
ssh_authorized_key = var.ssh_authorized_key ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10) cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix cluster_domain_suffix = var.cluster_domain_suffix
node_labels = join(",", var.node_labels)
} }
} }

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a> ## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.16.0 (upstream) * Kubernetes v1.16.3 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking * Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) * On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [low-priority](https://typhoon.psdn.io/cl/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization * Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [low-priority](https://typhoon.psdn.io/cl/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests) # Kubernetes assets (kubeconfig, manifests)
module "bootstrap" { module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=539b725093c8cd94ba46603adb25ac5280562ec8" source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=0daa1276c633fea28e41b2c2c18831e2584deb24"
cluster_name = var.cluster_name cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)] api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]

View File

@ -7,7 +7,7 @@ systemd:
- name: 40-etcd-cluster.conf - name: 40-etcd-cluster.conf
contents: | contents: |
[Service] [Service]
Environment="ETCD_IMAGE_TAG=v3.4.0" Environment="ETCD_IMAGE_TAG=v3.4.3"
Environment="ETCD_NAME=${etcd_name}" Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379" Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380" Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
@ -111,7 +111,7 @@ systemd:
--volume script,kind=host,source=/opt/bootstrap/apply \ --volume script,kind=host,source=/opt/bootstrap/apply \
--mount volume=script,target=/apply \ --mount volume=script,target=/apply \
--insecure-options=image \ --insecure-options=image \
docker://k8s.gcr.io/hyperkube:v1.16.0 \ docker://k8s.gcr.io/hyperkube:v1.16.3 \
--net=host \ --net=host \
--dns=host \ --dns=host \
--exec=/apply --exec=/apply
@ -132,7 +132,7 @@ storage:
contents: contents:
inline: | inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.16.0 KUBELET_IMAGE_TAG=v1.16.3
- path: /opt/bootstrap/apply - path: /opt/bootstrap/apply
filesystem: root filesystem: root
mode: 0544 mode: 0544

View File

@ -19,6 +19,10 @@ output "resource_group_name" {
value = azurerm_resource_group.cluster.name value = azurerm_resource_group.cluster.name
} }
output "resource_group_id" {
value = azurerm_resource_group.cluster.id
}
output "subnet_id" { output "subnet_id" {
value = azurerm_subnet.worker.id value = azurerm_subnet.worker.id
} }
@ -53,4 +57,3 @@ output "backend_address_pool_id" {
description = "ID of the worker backend address pool" description = "ID of the worker backend address pool"
value = azurerm_lb_backend_address_pool.worker.id value = azurerm_lb_backend_address_pool.worker.id
} }

View File

@ -23,27 +23,27 @@ variable "dns_zone_group" {
# instances # instances
variable "controller_count" { variable "controller_count" {
type = string type = number
default = "1"
description = "Number of controllers (i.e. masters)" description = "Number of controllers (i.e. masters)"
default = 1
} }
variable "worker_count" { variable "worker_count" {
type = string type = number
default = "1"
description = "Number of workers" description = "Number of workers"
default = 1
} }
variable "controller_type" { variable "controller_type" {
type = string type = string
default = "Standard_B2s"
description = "Machine type for controllers (see `az vm list-skus --location centralus`)" description = "Machine type for controllers (see `az vm list-skus --location centralus`)"
default = "Standard_B2s"
} }
variable "worker_type" { variable "worker_type" {
type = string type = string
default = "Standard_DS1_v2"
description = "Machine type for workers (see `az vm list-skus --location centralus`)" description = "Machine type for workers (see `az vm list-skus --location centralus`)"
default = "Standard_DS1_v2"
} }
variable "os_image" { variable "os_image" {
@ -53,15 +53,15 @@ variable "os_image" {
} }
variable "disk_size" { variable "disk_size" {
type = string type = number
default = "40"
description = "Size of the disk in GB" description = "Size of the disk in GB"
default = 40
} }
variable "worker_priority" { variable "worker_priority" {
type = string type = string
default = "Regular"
description = "Set worker priority to Low to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time." description = "Set worker priority to Low to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time."
default = "Regular"
} }
variable "controller_clc_snippets" { variable "controller_clc_snippets" {
@ -84,54 +84,60 @@ variable "ssh_authorized_key" {
} }
variable "asset_dir" { variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
type = string type = string
description = "Absolute path to a directory where generated assets should be placed (contains secrets)"
} }
variable "networking" { variable "networking" {
description = "Choice of networking provider (flannel or calico)"
type = string type = string
default = "flannel" description = "Choice of networking provider (flannel or calico)"
default = "calico"
} }
variable "host_cidr" { variable "host_cidr" {
description = "CIDR IPv4 range to assign to instances"
type = string type = string
description = "CIDR IPv4 range to assign to instances"
default = "10.0.0.0/16" default = "10.0.0.0/16"
} }
variable "pod_cidr" { variable "pod_cidr" {
description = "CIDR IPv4 range to assign Kubernetes pods"
type = string type = string
description = "CIDR IPv4 range to assign Kubernetes pods"
default = "10.2.0.0/16" default = "10.2.0.0/16"
} }
variable "service_cidr" { variable "service_cidr" {
type = string
description = <<EOD description = <<EOD
CIDR IPv4 range to assign Kubernetes services. CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns. The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD EOD
type = string
default = "10.3.0.0/16" default = "10.3.0.0/16"
} }
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = string
default = "cluster.local"
}
variable "enable_reporting" { variable "enable_reporting" {
type = string type = bool
description = "Enable usage or analytics reporting to upstreams (Calico)" description = "Enable usage or analytics reporting to upstreams (Calico)"
default = "false" default = false
} }
variable "enable_aggregation" { variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer (defaults to false)" description = "Enable the Kubernetes Aggregation Layer (defaults to false)"
type = string default = false
default = "false" }
variable "worker_node_labels" {
type = list(string)
description = "List of initial worker node labels"
default = []
}
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" {
type = string
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
default = "cluster.local"
} }

View File

@ -20,5 +20,6 @@ module "workers" {
service_cidr = var.service_cidr service_cidr = var.service_cidr
cluster_domain_suffix = var.cluster_domain_suffix cluster_domain_suffix = var.cluster_domain_suffix
clc_snippets = var.worker_clc_snippets clc_snippets = var.worker_clc_snippets
node_labels = var.worker_node_labels
} }

View File

@ -59,6 +59,9 @@ systemd:
--lock-file=/var/run/lock/kubelet.lock \ --lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \ --network-plugin=cni \
--node-labels=node.kubernetes.io/node \ --node-labels=node.kubernetes.io/node \
%{ for label in split(",", node_labels) }
--node-labels=${label} \
%{ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \ --pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \ --read-only-port=0 \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins --volume-plugin-dir=/var/lib/kubelet/volumeplugins
@ -93,7 +96,7 @@ storage:
contents: contents:
inline: | inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.16.0 KUBELET_IMAGE_TAG=v1.16.3
- path: /etc/sysctl.d/max-user-watches.conf - path: /etc/sysctl.d/max-user-watches.conf
filesystem: root filesystem: root
contents: contents:
@ -111,7 +114,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \ --volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \ --mount volume=config,target=/etc/kubernetes \
--insecure-options=image \ --insecure-options=image \
docker://k8s.gcr.io/hyperkube:v1.16.0 \ docker://k8s.gcr.io/hyperkube:v1.16.3 \
--net=host \ --net=host \
--dns=host \ --dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname | tr '[:upper:]' '[:lower:]') --exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname | tr '[:upper:]' '[:lower:]')

View File

@ -33,27 +33,27 @@ variable "backend_address_pool_id" {
# instances # instances
variable "worker_count" { variable "worker_count" {
type = string type = number
default = "1"
description = "Number of instances" description = "Number of instances"
default = 1
} }
variable "vm_type" { variable "vm_type" {
type = string type = string
default = "Standard_F1"
description = "Machine type for instances (see `az vm list-skus --location centralus`)" description = "Machine type for instances (see `az vm list-skus --location centralus`)"
default = "Standard_DS1_v2"
} }
variable "os_image" { variable "os_image" {
type = string type = string
default = "coreos-stable"
description = "Channel for a Container Linux derivative (coreos-stable, coreos-beta, coreos-alpha)" description = "Channel for a Container Linux derivative (coreos-stable, coreos-beta, coreos-alpha)"
default = "coreos-stable"
} }
variable "priority" { variable "priority" {
type = string type = string
default = "Regular"
description = "Set priority to Low to use reduced cost surplus capacity, with the tradeoff that instances can be evicted at any time." description = "Set priority to Low to use reduced cost surplus capacity, with the tradeoff that instances can be evicted at any time."
default = "Regular"
} }
variable "clc_snippets" { variable "clc_snippets" {
@ -75,16 +75,22 @@ variable "ssh_authorized_key" {
} }
variable "service_cidr" { variable "service_cidr" {
type = string
description = <<EOD description = <<EOD
CIDR IPv4 range to assign Kubernetes services. CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns. The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD EOD
type = string
default = "10.3.0.0/16" default = "10.3.0.0/16"
} }
variable "node_labels" {
type = list(string)
description = "List of initial node labels"
default = []
}
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" { variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) " description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = string type = string

View File

@ -111,6 +111,7 @@ data "template_file" "worker-config" {
ssh_authorized_key = var.ssh_authorized_key ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10) cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix cluster_domain_suffix = var.cluster_domain_suffix
node_labels = join(",", var.node_labels)
} }
} }

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a> ## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.16.0 (upstream) * Kubernetes v1.16.3 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking * Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) * On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization * Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization

View File

@ -1,10 +1,10 @@
# Kubernetes assets (kubeconfig, manifests) # Kubernetes assets (kubeconfig, manifests)
module "bootstrap" { module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=539b725093c8cd94ba46603adb25ac5280562ec8" source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=0daa1276c633fea28e41b2c2c18831e2584deb24"
cluster_name = var.cluster_name cluster_name = var.cluster_name
api_servers = [var.k8s_domain_name] api_servers = [var.k8s_domain_name]
etcd_servers = var.controller_domains etcd_servers = var.controllers.*.domain
asset_dir = var.asset_dir asset_dir = var.asset_dir
networking = var.networking networking = var.networking
network_mtu = var.network_mtu network_mtu = var.network_mtu

View File

@ -7,7 +7,7 @@ systemd:
- name: 40-etcd-cluster.conf - name: 40-etcd-cluster.conf
contents: | contents: |
[Service] [Service]
Environment="ETCD_IMAGE_TAG=v3.4.0" Environment="ETCD_IMAGE_TAG=v3.4.3"
Environment="ETCD_NAME=${etcd_name}" Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${domain_name}:2379" Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${domain_name}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${domain_name}:2380" Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${domain_name}:2380"
@ -126,7 +126,7 @@ systemd:
--volume script,kind=host,source=/opt/bootstrap/apply \ --volume script,kind=host,source=/opt/bootstrap/apply \
--mount volume=script,target=/apply \ --mount volume=script,target=/apply \
--insecure-options=image \ --insecure-options=image \
docker://k8s.gcr.io/hyperkube:v1.16.0 \ docker://k8s.gcr.io/hyperkube:v1.16.3 \
--net=host \ --net=host \
--dns=host \ --dns=host \
--exec=/apply --exec=/apply
@ -141,7 +141,7 @@ storage:
contents: contents:
inline: | inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.16.0 KUBELET_IMAGE_TAG=v1.16.3
- path: /etc/hostname - path: /etc/hostname
filesystem: root filesystem: root
mode: 0644 mode: 0644

View File

@ -35,7 +35,6 @@ storage:
-d ${install_disk} \ -d ${install_disk} \
-C ${os_channel} \ -C ${os_channel} \
-V ${os_version} \ -V ${os_version} \
-o "${container_linux_oem}" \
${baseurl_flag} \ ${baseurl_flag} \
-i ignition.json -i ignition.json
udevadm settle udevadm settle

View File

@ -91,7 +91,7 @@ storage:
contents: contents:
inline: | inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.16.0 KUBELET_IMAGE_TAG=v1.16.3
- path: /etc/hostname - path: /etc/hostname
filesystem: root filesystem: root
mode: 0644 mode: 0644

View File

@ -1,34 +1,34 @@
resource "matchbox_group" "install" { resource "matchbox_group" "install" {
count = length(var.controller_names) + length(var.worker_names) count = length(var.controllers) + length(var.workers)
name = format("install-%s", element(concat(var.controller_names, var.worker_names), count.index)) name = format("install-%s", concat(var.controllers.*.name, var.workers.*.name)[count.index])
# pick one of 4 Matchbox profiles (Container Linux or Flatcar, cached or non-cached) # pick one of 4 Matchbox profiles (Container Linux or Flatcar, cached or non-cached)
profile = local.flavor == "flatcar" ? var.cached_install == "true" ? element(matchbox_profile.cached-flatcar-linux-install.*.name, count.index) : element(matchbox_profile.flatcar-install.*.name, count.index) : var.cached_install == "true" ? element(matchbox_profile.cached-container-linux-install.*.name, count.index) : element(matchbox_profile.container-linux-install.*.name, count.index) profile = local.flavor == "flatcar" ? var.cached_install ? matchbox_profile.cached-flatcar-linux-install.*.name[count.index] : matchbox_profile.flatcar-install.*.name[count.index] : var.cached_install ? matchbox_profile.cached-container-linux-install.*.name[count.index] : matchbox_profile.container-linux-install.*.name[count.index]
selector = { selector = {
mac = element(concat(var.controller_macs, var.worker_macs), count.index) mac = concat(var.controllers.*.mac, var.workers.*.mac)[count.index]
} }
} }
resource "matchbox_group" "controller" { resource "matchbox_group" "controller" {
count = length(var.controller_names) count = length(var.controllers)
name = format("%s-%s", var.cluster_name, element(var.controller_names, count.index)) name = format("%s-%s", var.cluster_name, var.controllers[count.index].name)
profile = element(matchbox_profile.controllers.*.name, count.index) profile = matchbox_profile.controllers.*.name[count.index]
selector = { selector = {
mac = element(var.controller_macs, count.index) mac = var.controllers[count.index].mac
os = "installed" os = "installed"
} }
} }
resource "matchbox_group" "worker" { resource "matchbox_group" "worker" {
count = length(var.worker_names) count = length(var.workers)
name = format("%s-%s", var.cluster_name, element(var.worker_names, count.index)) name = format("%s-%s", var.cluster_name, var.workers[count.index].name)
profile = element(matchbox_profile.workers.*.name, count.index) profile = matchbox_profile.workers.*.name[count.index]
selector = { selector = {
mac = element(var.worker_macs, count.index) mac = var.workers[count.index].mac
os = "installed" os = "installed"
} }
} }

View File

@ -1,15 +1,14 @@
locals { locals {
# coreos-stable -> coreos flavor, stable channel # coreos-stable -> coreos flavor, stable channel
# flatcar-stable -> flatcar flavor, stable channel # flatcar-stable -> flatcar flavor, stable channel
flavor = element(split("-", var.os_channel), 0) flavor = split("-", var.os_channel)[0]
channel = split("-", var.os_channel)[1]
channel = element(split("-", var.os_channel), 1)
} }
// Container Linux Install profile (from release.core-os.net) // Container Linux Install profile (from release.core-os.net)
resource "matchbox_profile" "container-linux-install" { resource "matchbox_profile" "container-linux-install" {
count = length(var.controller_names) + length(var.worker_names) count = length(var.controllers) + length(var.workers)
name = format("%s-container-linux-install-%s", var.cluster_name, element(concat(var.controller_names, var.worker_names), count.index)) name = format("%s-container-linux-install-%s", var.cluster_name, concat(var.controllers.*.name, var.workers.*.name)[count.index])
kernel = "${var.download_protocol}://${local.channel}.release.core-os.net/amd64-usr/${var.os_version}/coreos_production_pxe.vmlinuz" kernel = "${var.download_protocol}://${local.channel}.release.core-os.net/amd64-usr/${var.os_version}/coreos_production_pxe.vmlinuz"
@ -26,11 +25,11 @@ resource "matchbox_profile" "container-linux-install" {
var.kernel_args, var.kernel_args,
]) ])
container_linux_config = element(data.template_file.container-linux-install-configs.*.rendered, count.index) container_linux_config = data.template_file.container-linux-install-configs.*.rendered[count.index]
} }
data "template_file" "container-linux-install-configs" { data "template_file" "container-linux-install-configs" {
count = length(var.controller_names) + length(var.worker_names) count = length(var.controllers) + length(var.workers)
template = file("${path.module}/cl/install.yaml.tmpl") template = file("${path.module}/cl/install.yaml.tmpl")
@ -40,7 +39,6 @@ data "template_file" "container-linux-install-configs" {
os_version = var.os_version os_version = var.os_version
ignition_endpoint = format("%s/ignition", var.matchbox_http_endpoint) ignition_endpoint = format("%s/ignition", var.matchbox_http_endpoint)
install_disk = var.install_disk install_disk = var.install_disk
container_linux_oem = var.container_linux_oem
ssh_authorized_key = var.ssh_authorized_key ssh_authorized_key = var.ssh_authorized_key
# only cached-container-linux profile adds -b baseurl # only cached-container-linux profile adds -b baseurl
baseurl_flag = "" baseurl_flag = ""
@ -50,8 +48,8 @@ data "template_file" "container-linux-install-configs" {
// Container Linux Install profile (from matchbox /assets cache) // Container Linux Install profile (from matchbox /assets cache)
// Note: Admin must have downloaded os_version into matchbox assets/coreos. // Note: Admin must have downloaded os_version into matchbox assets/coreos.
resource "matchbox_profile" "cached-container-linux-install" { resource "matchbox_profile" "cached-container-linux-install" {
count = length(var.controller_names) + length(var.worker_names) count = length(var.controllers) + length(var.workers)
name = format("%s-cached-container-linux-install-%s", var.cluster_name, element(concat(var.controller_names, var.worker_names), count.index)) name = format("%s-cached-container-linux-install-%s", var.cluster_name, concat(var.controllers.*.name, var.workers.*.name)[count.index])
kernel = "/assets/coreos/${var.os_version}/coreos_production_pxe.vmlinuz" kernel = "/assets/coreos/${var.os_version}/coreos_production_pxe.vmlinuz"
@ -68,11 +66,11 @@ resource "matchbox_profile" "cached-container-linux-install" {
var.kernel_args, var.kernel_args,
]) ])
container_linux_config = element(data.template_file.cached-container-linux-install-configs.*.rendered, count.index) container_linux_config = data.template_file.cached-container-linux-install-configs.*.rendered[count.index]
} }
data "template_file" "cached-container-linux-install-configs" { data "template_file" "cached-container-linux-install-configs" {
count = length(var.controller_names) + length(var.worker_names) count = length(var.controllers) + length(var.workers)
template = file("${path.module}/cl/install.yaml.tmpl") template = file("${path.module}/cl/install.yaml.tmpl")
@ -82,7 +80,6 @@ data "template_file" "cached-container-linux-install-configs" {
os_version = var.os_version os_version = var.os_version
ignition_endpoint = format("%s/ignition", var.matchbox_http_endpoint) ignition_endpoint = format("%s/ignition", var.matchbox_http_endpoint)
install_disk = var.install_disk install_disk = var.install_disk
container_linux_oem = var.container_linux_oem
ssh_authorized_key = var.ssh_authorized_key ssh_authorized_key = var.ssh_authorized_key
# profile uses -b baseurl to install from matchbox cache # profile uses -b baseurl to install from matchbox cache
baseurl_flag = "-b ${var.matchbox_http_endpoint}/assets/${local.flavor}" baseurl_flag = "-b ${var.matchbox_http_endpoint}/assets/${local.flavor}"
@ -91,8 +88,8 @@ data "template_file" "cached-container-linux-install-configs" {
// Flatcar Linux install profile (from release.flatcar-linux.net) // Flatcar Linux install profile (from release.flatcar-linux.net)
resource "matchbox_profile" "flatcar-install" { resource "matchbox_profile" "flatcar-install" {
count = length(var.controller_names) + length(var.worker_names) count = length(var.controllers) + length(var.workers)
name = format("%s-flatcar-install-%s", var.cluster_name, element(concat(var.controller_names, var.worker_names), count.index)) name = format("%s-flatcar-install-%s", var.cluster_name, concat(var.controllers.*.name, var.workers.*.name)[count.index])
kernel = "${var.download_protocol}://${local.channel}.release.flatcar-linux.net/amd64-usr/${var.os_version}/flatcar_production_pxe.vmlinuz" kernel = "${var.download_protocol}://${local.channel}.release.flatcar-linux.net/amd64-usr/${var.os_version}/flatcar_production_pxe.vmlinuz"
@ -109,14 +106,14 @@ resource "matchbox_profile" "flatcar-install" {
var.kernel_args, var.kernel_args,
]) ])
container_linux_config = element(data.template_file.container-linux-install-configs.*.rendered, count.index) container_linux_config = data.template_file.container-linux-install-configs.*.rendered[count.index]
} }
// Flatcar Linux Install profile (from matchbox /assets cache) // Flatcar Linux Install profile (from matchbox /assets cache)
// Note: Admin must have downloaded os_version into matchbox assets/flatcar. // Note: Admin must have downloaded os_version into matchbox assets/flatcar.
resource "matchbox_profile" "cached-flatcar-linux-install" { resource "matchbox_profile" "cached-flatcar-linux-install" {
count = length(var.controller_names) + length(var.worker_names) count = length(var.controllers) + length(var.workers)
name = format("%s-cached-flatcar-linux-install-%s", var.cluster_name, element(concat(var.controller_names, var.worker_names), count.index)) name = format("%s-cached-flatcar-linux-install-%s", var.cluster_name, concat(var.controllers.*.name, var.workers.*.name)[count.index])
kernel = "/assets/flatcar/${var.os_version}/flatcar_production_pxe.vmlinuz" kernel = "/assets/flatcar/${var.os_version}/flatcar_production_pxe.vmlinuz"
@ -133,32 +130,32 @@ resource "matchbox_profile" "cached-flatcar-linux-install" {
var.kernel_args, var.kernel_args,
]) ])
container_linux_config = element(data.template_file.cached-container-linux-install-configs.*.rendered, count.index) container_linux_config = data.template_file.cached-container-linux-install-configs.*.rendered[count.index]
} }
// Kubernetes Controller profiles // Kubernetes Controller profiles
resource "matchbox_profile" "controllers" { resource "matchbox_profile" "controllers" {
count = length(var.controller_names) count = length(var.controllers)
name = format("%s-controller-%s", var.cluster_name, element(var.controller_names, count.index)) name = format("%s-controller-%s", var.cluster_name, var.controllers.*.name[count.index])
raw_ignition = element(data.ct_config.controller-ignitions.*.rendered, count.index) raw_ignition = data.ct_config.controller-ignitions.*.rendered[count.index]
} }
data "ct_config" "controller-ignitions" { data "ct_config" "controller-ignitions" {
count = length(var.controller_names) count = length(var.controllers)
content = element(data.template_file.controller-configs.*.rendered, count.index) content = data.template_file.controller-configs.*.rendered[count.index]
pretty_print = false pretty_print = false
snippets = local.clc_map[element(var.controller_names, count.index)] snippets = local.clc_map[var.controllers.*.name[count.index]]
} }
data "template_file" "controller-configs" { data "template_file" "controller-configs" {
count = length(var.controller_names) count = length(var.controllers)
template = file("${path.module}/cl/controller.yaml.tmpl") template = file("${path.module}/cl/controller.yaml.tmpl")
vars = { vars = {
domain_name = element(var.controller_domains, count.index) domain_name = var.controllers.*.domain[count.index]
etcd_name = element(var.controller_names, count.index) etcd_name = var.controllers.*.name[count.index]
etcd_initial_cluster = join(",", formatlist("%s=https://%s:2380", var.controller_names, var.controller_domains)) etcd_initial_cluster = join(",", formatlist("%s=https://%s:2380", var.controllers.*.name, var.controllers.*.domain))
cgroup_driver = var.os_channel == "flatcar-edge" ? "systemd" : "cgroupfs" cgroup_driver = var.os_channel == "flatcar-edge" ? "systemd" : "cgroupfs"
cluster_dns_service_ip = module.bootstrap.cluster_dns_service_ip cluster_dns_service_ip = module.bootstrap.cluster_dns_service_ip
cluster_domain_suffix = var.cluster_domain_suffix cluster_domain_suffix = var.cluster_domain_suffix
@ -168,25 +165,25 @@ data "template_file" "controller-configs" {
// Kubernetes Worker profiles // Kubernetes Worker profiles
resource "matchbox_profile" "workers" { resource "matchbox_profile" "workers" {
count = length(var.worker_names) count = length(var.workers)
name = format("%s-worker-%s", var.cluster_name, element(var.worker_names, count.index)) name = format("%s-worker-%s", var.cluster_name, var.workers.*.name[count.index])
raw_ignition = element(data.ct_config.worker-ignitions.*.rendered, count.index) raw_ignition = data.ct_config.worker-ignitions.*.rendered[count.index]
} }
data "ct_config" "worker-ignitions" { data "ct_config" "worker-ignitions" {
count = length(var.worker_names) count = length(var.workers)
content = element(data.template_file.worker-configs.*.rendered, count.index) content = data.template_file.worker-configs.*.rendered[count.index]
pretty_print = false pretty_print = false
snippets = local.clc_map[element(var.worker_names, count.index)] snippets = local.clc_map[var.workers.*.name[count.index]]
} }
data "template_file" "worker-configs" { data "template_file" "worker-configs" {
count = length(var.worker_names) count = length(var.workers)
template = file("${path.module}/cl/worker.yaml.tmpl") template = file("${path.module}/cl/worker.yaml.tmpl")
vars = { vars = {
domain_name = element(var.worker_domains, count.index) domain_name = var.workers.*.domain[count.index]
cgroup_driver = var.os_channel == "flatcar-edge" ? "systemd" : "cgroupfs" cgroup_driver = var.os_channel == "flatcar-edge" ? "systemd" : "cgroupfs"
cluster_dns_service_ip = module.bootstrap.cluster_dns_service_ip cluster_dns_service_ip = module.bootstrap.cluster_dns_service_ip
cluster_domain_suffix = var.cluster_domain_suffix cluster_domain_suffix = var.cluster_domain_suffix
@ -200,7 +197,7 @@ locals {
# Default Container Linux config snippets map every node names to list("\n") so # Default Container Linux config snippets map every node names to list("\n") so
# all lookups succeed # all lookups succeed
clc_defaults = zipmap( clc_defaults = zipmap(
concat(var.controller_names, var.worker_names), concat(var.controllers.*.name, var.workers.*.name),
chunklist(data.template_file.clc-default-snippets.*.rendered, 1), chunklist(data.template_file.clc-default-snippets.*.rendered, 1),
) )
@ -210,7 +207,7 @@ locals {
// Horrible hack to generate a Terraform list of node count length // Horrible hack to generate a Terraform list of node count length
data "template_file" "clc-default-snippets" { data "template_file" "clc-default-snippets" {
count = length(var.controller_names) + length(var.worker_names) count = length(var.controllers) + length(var.workers)
template = "\n" template = "\n"
} }

View File

@ -1,6 +1,6 @@
# Secure copy assets to controllers. Activates kubelet.service # Secure copy assets to controllers. Activates kubelet.service
resource "null_resource" "copy-controller-secrets" { resource "null_resource" "copy-controller-secrets" {
count = length(var.controller_names) count = length(var.controllers)
# Without depends_on, remote-exec could start and wait for machines before # Without depends_on, remote-exec could start and wait for machines before
# matchbox groups are written, causing a deadlock. # matchbox groups are written, causing a deadlock.
@ -13,7 +13,7 @@ resource "null_resource" "copy-controller-secrets" {
connection { connection {
type = "ssh" type = "ssh"
host = var.controller_domains[count.index] host = var.controllers.*.domain[count.index]
user = "core" user = "core"
timeout = "60m" timeout = "60m"
} }
@ -76,7 +76,7 @@ resource "null_resource" "copy-controller-secrets" {
"sudo chown -R etcd:etcd /etc/ssl/etcd", "sudo chown -R etcd:etcd /etc/ssl/etcd",
"sudo chmod -R 500 /etc/ssl/etcd", "sudo chmod -R 500 /etc/ssl/etcd",
"sudo mv $HOME/assets /opt/bootstrap/assets", "sudo mv $HOME/assets /opt/bootstrap/assets",
"sudo mkdir -p /etc/kubernetes/manifests" "sudo mkdir -p /etc/kubernetes/manifests",
"sudo mkdir -p /etc/kubernetes/bootstrap-secrets", "sudo mkdir -p /etc/kubernetes/bootstrap-secrets",
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig", "sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
"sudo cp -r /opt/bootstrap/assets/tls/* /etc/kubernetes/bootstrap-secrets/", "sudo cp -r /opt/bootstrap/assets/tls/* /etc/kubernetes/bootstrap-secrets/",
@ -88,7 +88,7 @@ resource "null_resource" "copy-controller-secrets" {
# Secure copy kubeconfig to all workers. Activates kubelet.service # Secure copy kubeconfig to all workers. Activates kubelet.service
resource "null_resource" "copy-worker-secrets" { resource "null_resource" "copy-worker-secrets" {
count = length(var.worker_names) count = length(var.workers)
# Without depends_on, remote-exec could start and wait for machines before # Without depends_on, remote-exec could start and wait for machines before
# matchbox groups are written, causing a deadlock. # matchbox groups are written, causing a deadlock.
@ -100,7 +100,7 @@ resource "null_resource" "copy-worker-secrets" {
connection { connection {
type = "ssh" type = "ssh"
host = var.worker_domains[count.index] host = var.workers.*.domain[count.index]
user = "core" user = "core"
timeout = "60m" timeout = "60m"
} }
@ -129,7 +129,7 @@ resource "null_resource" "bootstrap" {
connection { connection {
type = "ssh" type = "ssh"
host = var.controller_domains[0] host = var.controllers[0].domain
user = "core" user = "core"
timeout = "15m" timeout = "15m"
} }

View File

@ -21,36 +21,32 @@ variable "os_version" {
} }
# machines # machines
# Terraform's crude "type system" does not properly support lists of maps so we do this.
variable "controller_names" { variable "controllers" {
type = list(string) type = list(object({
description = "Ordered list of controller names (e.g. [node1])" name = string
mac = string
domain = string
}))
description = <<EOD
List of controller machine details (unique name, identifying MAC address, FQDN)
[{ name = "node1", mac = "52:54:00:a1:9c:ae", domain = "node1.example.com"}]
EOD
} }
variable "controller_macs" { variable "workers" {
type = list(string) type = list(object({
description = "Ordered list of controller identifying MAC addresses (e.g. [52:54:00:a1:9c:ae])" name = string
} mac = string
domain = string
variable "controller_domains" { }))
type = list(string) description = <<EOD
description = "Ordered list of controller FQDNs (e.g. [node1.example.com])" List of worker machine details (unique name, identifying MAC address, FQDN)
} [
{ name = "node2", mac = "52:54:00:b2:2f:86", domain = "node2.example.com"},
variable "worker_names" { { name = "node3", mac = "52:54:00:c3:61:77", domain = "node3.example.com"}
type = list(string) ]
description = "Ordered list of worker names (e.g. [node2, node3])" EOD
}
variable "worker_macs" {
type = list(string)
description = "Ordered list of worker identifying MAC addresses (e.g. [52:54:00:b2:2f:86, 52:54:00:c3:61:77])"
}
variable "worker_domains" {
type = list(string)
description = "Ordered list of worker FQDNs (e.g. [node2.example.com, node3.example.com])"
} }
variable "clc_snippets" { variable "clc_snippets" {
@ -62,8 +58,8 @@ variable "clc_snippets" {
# configuration # configuration
variable "k8s_domain_name" { variable "k8s_domain_name" {
description = "Controller DNS name which resolves to a controller instance. Workers and kubeconfig's will communicate with this endpoint (e.g. cluster.example.com)"
type = string type = string
description = "Controller DNS name which resolves to a controller instance. Workers and kubeconfig's will communicate with this endpoint (e.g. cluster.example.com)"
} }
variable "ssh_authorized_key" { variable "ssh_authorized_key" {
@ -72,63 +68,55 @@ variable "ssh_authorized_key" {
} }
variable "asset_dir" { variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
type = string type = string
description = "Absolute path to a directory where generated assets should be placed (contains secrets)"
} }
variable "networking" { variable "networking" {
description = "Choice of networking provider (flannel or calico)"
type = string type = string
description = "Choice of networking provider (flannel or calico)"
default = "calico" default = "calico"
} }
variable "network_mtu" { variable "network_mtu" {
type = number
description = "CNI interface MTU (applies to calico only)" description = "CNI interface MTU (applies to calico only)"
type = string default = 1480
default = "1480"
} }
variable "network_ip_autodetection_method" { variable "network_ip_autodetection_method" {
description = "Method to autodetect the host IPv4 address (applies to calico only)"
type = string type = string
description = "Method to autodetect the host IPv4 address (applies to calico only)"
default = "first-found" default = "first-found"
} }
variable "pod_cidr" { variable "pod_cidr" {
description = "CIDR IPv4 range to assign Kubernetes pods"
type = string type = string
description = "CIDR IPv4 range to assign Kubernetes pods"
default = "10.2.0.0/16" default = "10.2.0.0/16"
} }
variable "service_cidr" { variable "service_cidr" {
type = string
description = <<EOD description = <<EOD
CIDR IPv4 range to assign Kubernetes services. CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns. The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD EOD
type = string
default = "10.3.0.0/16" default = "10.3.0.0/16"
} }
# optional # optional
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = string
default = "cluster.local"
}
variable "download_protocol" { variable "download_protocol" {
type = string type = string
default = "https"
description = "Protocol iPXE should use to download the kernel and initrd. Defaults to https, which requires iPXE compiled with crypto support. Unused if cached_install is true." description = "Protocol iPXE should use to download the kernel and initrd. Defaults to https, which requires iPXE compiled with crypto support. Unused if cached_install is true."
default = "https"
} }
variable "cached_install" { variable "cached_install" {
type = string type = bool
default = "false"
description = "Whether Container Linux should PXE boot and install from matchbox /assets cache. Note that the admin must have downloaded the os_version into matchbox assets." description = "Whether Container Linux should PXE boot and install from matchbox /assets cache. Note that the admin must have downloaded the os_version into matchbox assets."
default = false
} }
variable "install_disk" { variable "install_disk" {
@ -137,27 +125,29 @@ variable "install_disk" {
description = "Disk device to which the install profiles should install Container Linux (e.g. /dev/sda)" description = "Disk device to which the install profiles should install Container Linux (e.g. /dev/sda)"
} }
variable "container_linux_oem" {
type = string
default = ""
description = "DEPRECATED: Specify an OEM image id to use as base for the installation (e.g. ami, vmware_raw, xen) or leave blank for the default image"
}
variable "kernel_args" { variable "kernel_args" {
description = "Additional kernel arguments to provide at PXE boot."
type = list(string) type = list(string)
description = "Additional kernel arguments to provide at PXE boot."
default = [] default = []
} }
variable "enable_reporting" { variable "enable_reporting" {
type = string type = bool
description = "Enable usage or analytics reporting to upstreams (Calico)" description = "Enable usage or analytics reporting to upstreams (Calico)"
default = "false" default = false
} }
variable "enable_aggregation" { variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer (defaults to false)" description = "Enable the Kubernetes Aggregation Layer (defaults to false)"
type = string default = false
default = "false" }
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" {
type = string
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
default = "cluster.local"
} }

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a> ## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.16.0 (upstream) * Kubernetes v1.16.3 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking * Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) * On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization * Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization

View File

@ -1,10 +1,10 @@
# Kubernetes assets (kubeconfig, manifests) # Kubernetes assets (kubeconfig, manifests)
module "bootstrap" { module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=539b725093c8cd94ba46603adb25ac5280562ec8" source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=0daa1276c633fea28e41b2c2c18831e2584deb24"
cluster_name = var.cluster_name cluster_name = var.cluster_name
api_servers = [var.k8s_domain_name] api_servers = [var.k8s_domain_name]
etcd_servers = var.controller_domains etcd_servers = var.controllers.*.domain
asset_dir = var.asset_dir asset_dir = var.asset_dir
networking = var.networking networking = var.networking
network_mtu = var.network_mtu network_mtu = var.network_mtu

View File

@ -28,7 +28,7 @@ systemd:
--network host \ --network host \
--volume /var/lib/etcd:/var/lib/etcd:rw,Z \ --volume /var/lib/etcd:/var/lib/etcd:rw,Z \
--volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \ --volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \
quay.io/coreos/etcd:v3.4.0 quay.io/coreos/etcd:v3.4.3
ExecStop=/usr/bin/podman stop etcd ExecStop=/usr/bin/podman stop etcd
[Install] [Install]
WantedBy=multi-user.target WantedBy=multi-user.target
@ -81,7 +81,7 @@ systemd:
--volume /opt/cni/bin:/opt/cni/bin:z \ --volume /opt/cni/bin:/opt/cni/bin:z \
--volume /etc/iscsi:/etc/iscsi \ --volume /etc/iscsi:/etc/iscsi \
--volume /sbin/iscsiadm:/sbin/iscsiadm \ --volume /sbin/iscsiadm:/sbin/iscsiadm \
k8s.gcr.io/hyperkube:v1.16.0 /hyperkube kubelet \ k8s.gcr.io/hyperkube:v1.16.3 /hyperkube kubelet \
--anonymous-auth=false \ --anonymous-auth=false \
--authentication-token-webhook \ --authentication-token-webhook \
--authorization-mode=Webhook \ --authorization-mode=Webhook \
@ -132,7 +132,7 @@ systemd:
--network host \ --network host \
--volume /opt/bootstrap/assets:/assets:ro,Z \ --volume /opt/bootstrap/assets:/assets:ro,Z \
--volume /opt/bootstrap/apply:/apply:ro,Z \ --volume /opt/bootstrap/apply:/apply:ro,Z \
k8s.gcr.io/hyperkube:v1.16.0 \ k8s.gcr.io/hyperkube:v1.16.3 \
/apply /apply
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
ExecStartPost=-/usr/bin/podman stop bootstrap ExecStartPost=-/usr/bin/podman stop bootstrap

View File

@ -51,7 +51,7 @@ systemd:
--volume /opt/cni/bin:/opt/cni/bin:z \ --volume /opt/cni/bin:/opt/cni/bin:z \
--volume /etc/iscsi:/etc/iscsi \ --volume /etc/iscsi:/etc/iscsi \
--volume /sbin/iscsiadm:/sbin/iscsiadm \ --volume /sbin/iscsiadm:/sbin/iscsiadm \
k8s.gcr.io/hyperkube:v1.16.0 /hyperkube kubelet \ k8s.gcr.io/hyperkube:v1.16.3 /hyperkube kubelet \
--anonymous-auth=false \ --anonymous-auth=false \
--authentication-token-webhook \ --authentication-token-webhook \
--authorization-mode=Webhook \ --authorization-mode=Webhook \

View File

@ -1,22 +1,22 @@
# Match each controller or worker to a profile # Match each controller or worker to a profile
resource "matchbox_group" "controller" { resource "matchbox_group" "controller" {
count = length(var.controller_names) count = length(var.controllers)
name = format("%s-%s", var.cluster_name, var.controller_names[count.index]) name = format("%s-%s", var.cluster_name, var.controllers.*.name[count.index])
profile = matchbox_profile.controllers.*.name[count.index] profile = matchbox_profile.controllers.*.name[count.index]
selector = { selector = {
mac = var.controller_macs[count.index] mac = var.controllers.*.mac[count.index]
} }
} }
resource "matchbox_group" "worker" { resource "matchbox_group" "worker" {
count = length(var.worker_names) count = length(var.workers)
name = format("%s-%s", var.cluster_name, var.worker_names[count.index]) name = format("%s-%s", var.cluster_name, var.workers.*.name[count.index])
profile = matchbox_profile.workers.*.name[count.index] profile = matchbox_profile.workers.*.name[count.index]
selector = { selector = {
mac = var.worker_macs[count.index] mac = var.workers.*.mac[count.index]
} }
} }

View File

@ -1,36 +1,36 @@
locals { locals {
remote_kernel = "https://builds.coreos.fedoraproject.org/prod/streams/${var.os_stream}/builds/${var.os_version}/x86_64/fedora-coreos-${var.os_version}-installer-kernel" remote_kernel = "https://builds.coreos.fedoraproject.org/prod/streams/${var.os_stream}/builds/${var.os_version}/x86_64/fedora-coreos-${var.os_version}-installer-kernel-x86_64"
remote_initrd = "https://builds.coreos.fedoraproject.org/prod/streams/${var.os_stream}/builds/${var.os_version}/x86_64/fedora-coreos-${var.os_version}-installer-initramfs.img" remote_initrd = "https://builds.coreos.fedoraproject.org/prod/streams/${var.os_stream}/builds/${var.os_version}/x86_64/fedora-coreos-${var.os_version}-installer-initramfs.x86_64.img"
remote_args = [ remote_args = [
"ip=dhcp", "ip=dhcp",
"rd.neednet=1", "rd.neednet=1",
"coreos.inst=yes", "coreos.inst=yes",
"coreos.inst.image_url=https://builds.coreos.fedoraproject.org/prod/streams/${var.os_stream}/builds/${var.os_version}/x86_64/fedora-coreos-${var.os_version}-metal.raw.xz", "coreos.inst.image_url=https://builds.coreos.fedoraproject.org/prod/streams/${var.os_stream}/builds/${var.os_version}/x86_64/fedora-coreos-${var.os_version}-metal.x86_64.raw.xz",
"coreos.inst.ignition_url=${var.matchbox_http_endpoint}/ignition?uuid=$${uuid}&mac=$${mac:hexhyp}", "coreos.inst.ignition_url=${var.matchbox_http_endpoint}/ignition?uuid=$${uuid}&mac=$${mac:hexhyp}",
"coreos.inst.install_dev=${var.install_disk}" "coreos.inst.install_dev=${var.install_disk}"
] ]
cached_kernel = "/assets/fedora-coreos/fedora-coreos-${var.os_version}-installer-kernel" cached_kernel = "/assets/fedora-coreos/fedora-coreos-${var.os_version}-installer-kernel-x86_64"
cached_initrd = "/assets/fedora-coreos/fedora-coreos-${var.os_version}-installer-initramfs.img" cached_initrd = "/assets/fedora-coreos/fedora-coreos-${var.os_version}-installer-initramfs.x86_64.img"
cached_args = [ cached_args = [
"ip=dhcp", "ip=dhcp",
"rd.neednet=1", "rd.neednet=1",
"coreos.inst=yes", "coreos.inst=yes",
"coreos.inst.image_url=${var.matchbox_http_endpoint}/assets/fedora-coreos/fedora-coreos-${var.os_version}-metal.raw.xz", "coreos.inst.image_url=${var.matchbox_http_endpoint}/assets/fedora-coreos/fedora-coreos-${var.os_version}-metal.x86_64.raw.xz",
"coreos.inst.ignition_url=${var.matchbox_http_endpoint}/ignition?uuid=$${uuid}&mac=$${mac:hexhyp}", "coreos.inst.ignition_url=${var.matchbox_http_endpoint}/ignition?uuid=$${uuid}&mac=$${mac:hexhyp}",
"coreos.inst.install_dev=${var.install_disk}" "coreos.inst.install_dev=${var.install_disk}"
] ]
kernel = var.cached_install == "true" ? local.cached_kernel : local.remote_kernel kernel = var.cached_install ? local.cached_kernel : local.remote_kernel
initrd = var.cached_install == "true" ? local.cached_initrd : local.remote_initrd initrd = var.cached_install ? local.cached_initrd : local.remote_initrd
args = var.cached_install == "true" ? local.cached_args : local.remote_args args = var.cached_install ? local.cached_args : local.remote_args
} }
// Fedora CoreOS controller profile // Fedora CoreOS controller profile
resource "matchbox_profile" "controllers" { resource "matchbox_profile" "controllers" {
count = length(var.controller_names) count = length(var.controllers)
name = format("%s-controller-%s", var.cluster_name, var.controller_names[count.index]) name = format("%s-controller-%s", var.cluster_name, var.controllers.*.name[count.index])
kernel = local.kernel kernel = local.kernel
initrd = [ initrd = [
@ -42,20 +42,20 @@ resource "matchbox_profile" "controllers" {
} }
data "ct_config" "controller-ignitions" { data "ct_config" "controller-ignitions" {
count = length(var.controller_names) count = length(var.controllers)
content = data.template_file.controller-configs.*.rendered[count.index] content = data.template_file.controller-configs.*.rendered[count.index]
strict = true strict = true
} }
data "template_file" "controller-configs" { data "template_file" "controller-configs" {
count = length(var.controller_names) count = length(var.controllers)
template = file("${path.module}/fcc/controller.yaml") template = file("${path.module}/fcc/controller.yaml")
vars = { vars = {
domain_name = var.controller_domains[count.index] domain_name = var.controllers.*.domain[count.index]
etcd_name = var.controller_names[count.index] etcd_name = var.controllers.*.name[count.index]
etcd_initial_cluster = join(",", formatlist("%s=https://%s:2380", var.controller_names, var.controller_domains)) etcd_initial_cluster = join(",", formatlist("%s=https://%s:2380", var.controllers.*.name, var.controllers.*.domain))
cluster_dns_service_ip = module.bootstrap.cluster_dns_service_ip cluster_dns_service_ip = module.bootstrap.cluster_dns_service_ip
cluster_domain_suffix = var.cluster_domain_suffix cluster_domain_suffix = var.cluster_domain_suffix
ssh_authorized_key = var.ssh_authorized_key ssh_authorized_key = var.ssh_authorized_key
@ -64,8 +64,8 @@ data "template_file" "controller-configs" {
// Fedora CoreOS worker profile // Fedora CoreOS worker profile
resource "matchbox_profile" "workers" { resource "matchbox_profile" "workers" {
count = length(var.worker_names) count = length(var.workers)
name = format("%s-worker-%s", var.cluster_name, var.worker_names[count.index]) name = format("%s-worker-%s", var.cluster_name, var.workers.*.name[count.index])
kernel = local.kernel kernel = local.kernel
initrd = [ initrd = [
@ -77,18 +77,18 @@ resource "matchbox_profile" "workers" {
} }
data "ct_config" "worker-ignitions" { data "ct_config" "worker-ignitions" {
count = length(var.worker_names) count = length(var.workers)
content = data.template_file.worker-configs.*.rendered[count.index] content = data.template_file.worker-configs.*.rendered[count.index]
strict = true strict = true
} }
data "template_file" "worker-configs" { data "template_file" "worker-configs" {
count = length(var.worker_names) count = length(var.workers)
template = file("${path.module}/fcc/worker.yaml") template = file("${path.module}/fcc/worker.yaml")
vars = { vars = {
domain_name = var.worker_domains[count.index] domain_name = var.workers.*.domain[count.index]
cluster_dns_service_ip = module.bootstrap.cluster_dns_service_ip cluster_dns_service_ip = module.bootstrap.cluster_dns_service_ip
cluster_domain_suffix = var.cluster_domain_suffix cluster_domain_suffix = var.cluster_domain_suffix
ssh_authorized_key = var.ssh_authorized_key ssh_authorized_key = var.ssh_authorized_key

View File

@ -1,6 +1,6 @@
# Secure copy assets to controllers. Activates kubelet.service # Secure copy assets to controllers. Activates kubelet.service
resource "null_resource" "copy-controller-secrets" { resource "null_resource" "copy-controller-secrets" {
count = length(var.controller_names) count = length(var.controllers)
# Without depends_on, remote-exec could start and wait for machines before # Without depends_on, remote-exec could start and wait for machines before
# matchbox groups are written, causing a deadlock. # matchbox groups are written, causing a deadlock.
@ -12,7 +12,7 @@ resource "null_resource" "copy-controller-secrets" {
connection { connection {
type = "ssh" type = "ssh"
host = var.controller_domains[count.index] host = var.controllers.*.domain[count.index]
user = "core" user = "core"
timeout = "60m" timeout = "60m"
} }
@ -73,7 +73,7 @@ resource "null_resource" "copy-controller-secrets" {
"sudo mv etcd-peer.crt /etc/ssl/etcd/etcd/peer.crt", "sudo mv etcd-peer.crt /etc/ssl/etcd/etcd/peer.crt",
"sudo mv etcd-peer.key /etc/ssl/etcd/etcd/peer.key", "sudo mv etcd-peer.key /etc/ssl/etcd/etcd/peer.key",
"sudo mv $HOME/assets /opt/bootstrap/assets", "sudo mv $HOME/assets /opt/bootstrap/assets",
"sudo mkdir -p /etc/kubernetes/manifests" "sudo mkdir -p /etc/kubernetes/manifests",
"sudo mkdir -p /etc/kubernetes/bootstrap-secrets", "sudo mkdir -p /etc/kubernetes/bootstrap-secrets",
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig", "sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
"sudo cp -r /opt/bootstrap/assets/tls/* /etc/kubernetes/bootstrap-secrets/", "sudo cp -r /opt/bootstrap/assets/tls/* /etc/kubernetes/bootstrap-secrets/",
@ -85,7 +85,7 @@ resource "null_resource" "copy-controller-secrets" {
# Secure copy kubeconfig to all workers. Activates kubelet.service # Secure copy kubeconfig to all workers. Activates kubelet.service
resource "null_resource" "copy-worker-secrets" { resource "null_resource" "copy-worker-secrets" {
count = length(var.worker_names) count = length(var.workers)
# Without depends_on, remote-exec could start and wait for machines before # Without depends_on, remote-exec could start and wait for machines before
# matchbox groups are written, causing a deadlock. # matchbox groups are written, causing a deadlock.
@ -96,7 +96,7 @@ resource "null_resource" "copy-worker-secrets" {
connection { connection {
type = "ssh" type = "ssh"
host = var.worker_domains[count.index] host = var.workers.*.domain[count.index]
user = "core" user = "core"
timeout = "60m" timeout = "60m"
} }
@ -125,7 +125,7 @@ resource "null_resource" "bootstrap" {
connection { connection {
type = "ssh" type = "ssh"
host = var.controller_domains[0] host = var.controllers[0].domain
user = "core" user = "core"
timeout = "15m" timeout = "15m"
} }

View File

@ -22,36 +22,32 @@ variable "os_version" {
} }
# machines # machines
# Terraform's crude "type system" does not properly support lists of maps so we do this.
variable "controller_names" { variable "controllers" {
type = list(string) type = list(object({
description = "Ordered list of controller names (e.g. [node1])" name = string
mac = string
domain = string
}))
description = <<EOD
List of controller machine details (unique name, identifying MAC address, FQDN)
[{ name = "node1", mac = "52:54:00:a1:9c:ae", domain = "node1.example.com"}]
EOD
} }
variable "controller_macs" { variable "workers" {
type = list(string) type = list(object({
description = "Ordered list of controller identifying MAC addresses (e.g. [52:54:00:a1:9c:ae])" name = string
} mac = string
domain = string
variable "controller_domains" { }))
type = list(string) description = <<EOD
description = "Ordered list of controller FQDNs (e.g. [node1.example.com])" List of worker machine details (unique name, identifying MAC address, FQDN)
} [
{ name = "node2", mac = "52:54:00:b2:2f:86", domain = "node2.example.com"},
variable "worker_names" { { name = "node3", mac = "52:54:00:c3:61:77", domain = "node3.example.com"}
type = list(string) ]
description = "Ordered list of worker names (e.g. [node2, node3])" EOD
}
variable "worker_macs" {
type = list(string)
description = "Ordered list of worker identifying MAC addresses (e.g. [52:54:00:b2:2f:86, 52:54:00:c3:61:77])"
}
variable "worker_domains" {
type = list(string)
description = "Ordered list of worker FQDNs (e.g. [node2.example.com, node3.example.com])"
} }
variable "snippets" { variable "snippets" {
@ -63,8 +59,8 @@ variable "snippets" {
# configuration # configuration
variable "k8s_domain_name" { variable "k8s_domain_name" {
description = "Controller DNS name which resolves to a controller instance. Workers and kubeconfig's will communicate with this endpoint (e.g. cluster.example.com)"
type = string type = string
description = "Controller DNS name which resolves to a controller instance. Workers and kubeconfig's will communicate with this endpoint (e.g. cluster.example.com)"
} }
variable "ssh_authorized_key" { variable "ssh_authorized_key" {
@ -73,80 +69,80 @@ variable "ssh_authorized_key" {
} }
variable "asset_dir" { variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
type = string type = string
description = "Absolute path to a directory where generated assets should be placed (contains secrets)"
} }
variable "networking" { variable "networking" {
description = "Choice of networking provider (flannel or calico)"
type = string type = string
description = "Choice of networking provider (flannel or calico)"
default = "calico" default = "calico"
} }
variable "network_mtu" { variable "network_mtu" {
type = number
description = "CNI interface MTU (applies to calico only)" description = "CNI interface MTU (applies to calico only)"
type = string default = 1480
default = "1480"
} }
variable "network_ip_autodetection_method" { variable "network_ip_autodetection_method" {
description = "Method to autodetect the host IPv4 address (applies to calico only)"
type = string type = string
description = "Method to autodetect the host IPv4 address (applies to calico only)"
default = "first-found" default = "first-found"
} }
variable "pod_cidr" { variable "pod_cidr" {
description = "CIDR IPv4 range to assign Kubernetes pods"
type = string type = string
description = "CIDR IPv4 range to assign Kubernetes pods"
default = "10.2.0.0/16" default = "10.2.0.0/16"
} }
variable "service_cidr" { variable "service_cidr" {
type = string
description = <<EOD description = <<EOD
CIDR IPv4 range to assign Kubernetes services. CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns. The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD EOD
type = string
default = "10.3.0.0/16" default = "10.3.0.0/16"
} }
# optional # optional
variable "cached_install" {
type = bool
description = "Whether Fedora CoreOS should PXE boot and install from matchbox /assets cache. Note that the admin must have downloaded the os_version into matchbox assets."
default = false
}
variable "install_disk" {
type = string
description = "Disk device to install Fedora CoreOS (e.g. sda)"
default = "sda"
}
variable "kernel_args" {
type = list(string)
description = "Additional kernel arguments to provide at PXE boot."
default = []
}
variable "enable_reporting" {
type = bool
description = "Enable usage or analytics reporting to upstreams (Calico)"
default = false
}
variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer (defaults to false)"
default = false
}
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" { variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) " description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = string type = string
default = "cluster.local" default = "cluster.local"
} }
variable "cached_install" {
type = string
default = "false"
description = "Whether Fedora CoreOS should PXE boot and install from matchbox /assets cache. Note that the admin must have downloaded the os_version into matchbox assets."
}
variable "install_disk" {
type = string
default = "sda"
description = "Disk device to install Fedora CoreOS (e.g. sda)"
}
variable "kernel_args" {
description = "Additional kernel arguments to provide at PXE boot."
type = list(string)
default = []
}
variable "enable_reporting" {
type = string
description = "Enable usage or analytics reporting to upstreams (Calico)"
default = "false"
}
variable "enable_aggregation" {
description = "Enable the Kubernetes Aggregation Layer (defaults to false)"
type = string
default = "false"
}

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a> ## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.16.0 (upstream) * Kubernetes v1.16.3 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking * Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) * On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization * Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests) # Kubernetes assets (kubeconfig, manifests)
module "bootstrap" { module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=539b725093c8cd94ba46603adb25ac5280562ec8" source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=0daa1276c633fea28e41b2c2c18831e2584deb24"
cluster_name = var.cluster_name cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)] api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]

View File

@ -7,7 +7,7 @@ systemd:
- name: 40-etcd-cluster.conf - name: 40-etcd-cluster.conf
contents: | contents: |
[Service] [Service]
Environment="ETCD_IMAGE_TAG=v3.4.0" Environment="ETCD_IMAGE_TAG=v3.4.3"
Environment="ETCD_NAME=${etcd_name}" Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379" Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380" Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
@ -123,7 +123,7 @@ systemd:
--volume script,kind=host,source=/opt/bootstrap/apply \ --volume script,kind=host,source=/opt/bootstrap/apply \
--mount volume=script,target=/apply \ --mount volume=script,target=/apply \
--insecure-options=image \ --insecure-options=image \
docker://k8s.gcr.io/hyperkube:v1.16.0 \ docker://k8s.gcr.io/hyperkube:v1.16.3 \
--net=host \ --net=host \
--dns=host \ --dns=host \
--exec=/apply --exec=/apply
@ -138,7 +138,7 @@ storage:
contents: contents:
inline: | inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.16.0 KUBELET_IMAGE_TAG=v1.16.3
- path: /opt/bootstrap/apply - path: /opt/bootstrap/apply
filesystem: root filesystem: root
mode: 0544 mode: 0544

View File

@ -99,7 +99,7 @@ storage:
contents: contents:
inline: | inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.16.0 KUBELET_IMAGE_TAG=v1.16.3
- path: /etc/sysctl.d/max-user-watches.conf - path: /etc/sysctl.d/max-user-watches.conf
filesystem: root filesystem: root
contents: contents:
@ -117,7 +117,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \ --volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \ --mount volume=config,target=/etc/kubernetes \
--insecure-options=image \ --insecure-options=image \
docker://k8s.gcr.io/hyperkube:v1.16.0 \ docker://k8s.gcr.io/hyperkube:v1.16.3 \
--net=host \ --net=host \
--dns=host \ --dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname) --exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)

View File

@ -72,7 +72,7 @@ resource "null_resource" "copy-controller-secrets" {
"sudo chown -R etcd:etcd /etc/ssl/etcd", "sudo chown -R etcd:etcd /etc/ssl/etcd",
"sudo chmod -R 500 /etc/ssl/etcd", "sudo chmod -R 500 /etc/ssl/etcd",
"sudo mv $HOME/assets /opt/bootstrap/assets", "sudo mv $HOME/assets /opt/bootstrap/assets",
"sudo mkdir -p /etc/kubernetes/manifests" "sudo mkdir -p /etc/kubernetes/manifests",
"sudo mkdir -p /etc/kubernetes/bootstrap-secrets", "sudo mkdir -p /etc/kubernetes/bootstrap-secrets",
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig", "sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
"sudo cp -r /opt/bootstrap/assets/tls/* /etc/kubernetes/bootstrap-secrets/", "sudo cp -r /opt/bootstrap/assets/tls/* /etc/kubernetes/bootstrap-secrets/",

View File

@ -18,33 +18,33 @@ variable "dns_zone" {
# instances # instances
variable "controller_count" { variable "controller_count" {
type = string type = number
default = "1"
description = "Number of controllers (i.e. masters)" description = "Number of controllers (i.e. masters)"
default = 1
} }
variable "worker_count" { variable "worker_count" {
type = string type = number
default = "1"
description = "Number of workers" description = "Number of workers"
default = 1
} }
variable "controller_type" { variable "controller_type" {
type = string type = string
default = "s-2vcpu-2gb"
description = "Droplet type for controllers (e.g. s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb)." description = "Droplet type for controllers (e.g. s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb)."
default = "s-2vcpu-2gb"
} }
variable "worker_type" { variable "worker_type" {
type = string type = string
default = "s-1vcpu-2gb"
description = "Droplet type for workers (e.g. s-1vcpu-2gb, s-2vcpu-2gb)" description = "Droplet type for workers (e.g. s-1vcpu-2gb, s-2vcpu-2gb)"
default = "s-1vcpu-2gb"
} }
variable "image" { variable "image" {
type = string type = string
default = "coreos-stable"
description = "Container Linux image for instances (e.g. coreos-stable)" description = "Container Linux image for instances (e.g. coreos-stable)"
default = "coreos-stable"
} }
variable "controller_clc_snippets" { variable "controller_clc_snippets" {
@ -67,48 +67,48 @@ variable "ssh_fingerprints" {
} }
variable "asset_dir" { variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
type = string type = string
description = "Absolute path to a directory where generated assets should be placed (contains secrets)"
} }
variable "networking" { variable "networking" {
description = "Choice of networking provider (flannel or calico)"
type = string type = string
default = "flannel" description = "Choice of networking provider (flannel or calico)"
default = "calico"
} }
variable "pod_cidr" { variable "pod_cidr" {
description = "CIDR IPv4 range to assign Kubernetes pods"
type = string type = string
description = "CIDR IPv4 range to assign Kubernetes pods"
default = "10.2.0.0/16" default = "10.2.0.0/16"
} }
variable "service_cidr" { variable "service_cidr" {
type = string
description = <<EOD description = <<EOD
CIDR IPv4 range to assign Kubernetes services. CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns. The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD EOD
type = string
default = "10.3.0.0/16" default = "10.3.0.0/16"
} }
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = string
default = "cluster.local"
}
variable "enable_reporting" { variable "enable_reporting" {
type = string type = bool
description = "Enable usage or analytics reporting to upstreams (Calico)" description = "Enable usage or analytics reporting to upstreams (Calico)"
default = "false" default = false
} }
variable "enable_aggregation" { variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer (defaults to false)" description = "Enable the Kubernetes Aggregation Layer (defaults to false)"
type = string default = false
default = "false" }
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" {
type = string
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
default = "cluster.local"
} }

View File

@ -62,11 +62,14 @@ The AWS internal `workers` module supports a number of [variables](https://githu
|:-----|:------------|:--------|:--------| |:-----|:------------|:--------|:--------|
| worker_count | Number of instances | 1 | 3 | | worker_count | Number of instances | 1 | 3 |
| instance_type | EC2 instance type | "t3.small" | "t3.medium" | | instance_type | EC2 instance type | "t3.small" | "t3.medium" |
| os_image | AMI channel for a Container Linux derivative | coreos-stable | coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta, flatcar-alpha | | os_image | AMI channel for a Container Linux derivative | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta, flatcar-alpha |
| disk_size | Size of the disk in GB | 40 | 100 | | disk_size | Size of the EBS volume in GB | 40 | 100 |
| spot_price | Spot price in USD for workers. Leave as default empty string for regular on-demand instances | "" | "0.10" | | disk_type | Type of the EBS volume | "gp2" | standard, gp2, io1 |
| disk_iops | IOPS of the EBS volume | 0 (i.e. auto) | 400 |
| spot_price | Spot price in USD for worker instances or 0 to use on-demand instances | 0 | 0.10 |
| clc_snippets | Container Linux Config snippets | [] | [example](/advanced/customization/#usage) |
| service_cidr | Must match `service_cidr` of cluster | "10.3.0.0/16" | "10.3.0.0/24" | | service_cidr | Must match `service_cidr` of cluster | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | Must match `cluster_domain_suffix` of cluster | "cluster.local" | "k8s.example.com" | | node_labels | List of initial node labels | [] | ["worker-pool=foo"] |
Check the list of valid [instance types](https://aws.amazon.com/ec2/instance-types/) or per-region and per-type [spot prices](https://aws.amazon.com/ec2/spot/pricing/). Check the list of valid [instance types](https://aws.amazon.com/ec2/instance-types/) or per-region and per-type [spot prices](https://aws.amazon.com/ec2/spot/pricing/).
@ -76,7 +79,7 @@ Create a cluster following the Azure [tutorial](../cl/azure.md#cluster). Define
```tf ```tf
module "ramius-worker-pool" { module "ramius-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes/workers?ref=v1.16.0" source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes/workers?ref=v1.16.3"
# Azure # Azure
region = module.azure-ramius.region region = module.azure-ramius.region
@ -127,12 +130,12 @@ The Azure internal `workers` module supports a number of [variables](https://git
| Name | Description | Default | Example | | Name | Description | Default | Example |
|:-----|:------------|:--------|:--------| |:-----|:------------|:--------|:--------|
| worker_count | Number of instances | 1 | 3 | | worker_count | Number of instances | 1 | 3 |
| vm_type | Machine type for instances | "Standard_F1" | See below | | vm_type | Machine type for instances | "Standard_DS1_v2" | See below |
| os_image | Channel for a Container Linux derivative | coreos-stable | coreos-stable, coreos-beta, coreos-alpha | | os_image | Channel for a Container Linux derivative | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha |
| priority | Set priority to Low to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time | Regular | Low | | priority | Set priority to Low to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time | "Regular" | "Low" |
| clc_snippets | Container Linux Config snippets | [] | [example](/advanced/customization/#usage) | | clc_snippets | Container Linux Config snippets | [] | [example](/advanced/customization/#usage) |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" | | service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by coredns. | "cluster.local" | "k8s.example.com" | | node_labels | List of initial node labels | [] | ["worker-pool=foo"] |
Check the list of valid [machine types](https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux/) and their [specs](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/sizes-general). Use `az vm list-skus` to get the identifier. Check the list of valid [machine types](https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux/) and their [specs](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/sizes-general). Use `az vm list-skus` to get the identifier.
@ -142,7 +145,7 @@ Create a cluster following the Google Cloud [tutorial](../cl/google-cloud.md#clu
```tf ```tf
module "yavin-worker-pool" { module "yavin-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.16.0" source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.16.3"
# Google Cloud # Google Cloud
region = "europe-west2" region = "europe-west2"
@ -173,11 +176,11 @@ Verify a managed instance group of workers joins the cluster within a few minute
``` ```
$ kubectl get nodes $ kubectl get nodes
NAME STATUS AGE VERSION NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.16.0 yavin-controller-0.c.example-com.internal Ready 6m v1.16.3
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.16.0 yavin-worker-jrbf.c.example-com.internal Ready 5m v1.16.3
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.16.0 yavin-worker-mzdm.c.example-com.internal Ready 5m v1.16.3
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.16.0 yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.16.3
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.16.0 yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.16.3
``` ```
### Variables ### Variables
@ -189,9 +192,9 @@ The Google Cloud internal `workers` module supports a number of [variables](http
| Name | Description | Example | | Name | Description | Example |
|:-----|:------------|:--------| |:-----|:------------|:--------|
| name | Unique name (distinct from cluster name) | "yavin-16x" | | name | Unique name (distinct from cluster name) | "yavin-16x" |
| cluster_name | Must be set to `cluster_name` of cluster | "yavin" |
| region | Region for the worker pool instances. May differ from the cluster's region | "europe-west2" | | region | Region for the worker pool instances. May differ from the cluster's region | "europe-west2" |
| network | Must be set to `network_name` output by cluster | module.cluster.network_name | | network | Must be set to `network_name` output by cluster | module.cluster.network_name |
| cluster_name | Must be set to `cluster_name` of cluster | "yavin" |
| kubeconfig | Must be set to `kubeconfig` output by cluster | module.cluster.kubeconfig | | kubeconfig | Must be set to `kubeconfig` output by cluster | module.cluster.kubeconfig |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." | | ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
@ -206,8 +209,9 @@ Check the list of regions [docs](https://cloud.google.com/compute/docs/regions-z
| os_image | Container Linux image for compute instances | "coreos-stable" | "coreos-alpha", "coreos-beta" | | os_image | Container Linux image for compute instances | "coreos-stable" | "coreos-alpha", "coreos-beta" |
| disk_size | Size of the disk in GB | 40 | 100 | | disk_size | Size of the disk in GB | 40 | 100 |
| preemptible | If true, Compute Engine will terminate instances randomly within 24 hours | false | true | | preemptible | If true, Compute Engine will terminate instances randomly within 24 hours | false | true |
| clc_snippets | Container Linux Config snippets | [] | [example](/advanced/customization/#usage) |
| service_cidr | Must match `service_cidr` of cluster | "10.3.0.0/16" | "10.3.0.0/24" | | service_cidr | Must match `service_cidr` of cluster | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | Must match `cluster_domain_suffix` of cluster | "cluster.local" | "k8s.example.com" | | node_labels | List of initial node labels | [] | ["worker-pool=foo"] |
Check the list of valid [machine types](https://cloud.google.com/compute/docs/machine-types). Check the list of valid [machine types](https://cloud.google.com/compute/docs/machine-types).

View File

@ -1,6 +1,6 @@
# AWS # AWS
In this tutorial, we'll create a Kubernetes v1.16.0 cluster on AWS with Container Linux. In this tutorial, we'll create a Kubernetes v1.16.3 cluster on AWS with Container Linux.
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets.
@ -18,7 +18,7 @@ Install [Terraform](https://www.terraform.io/downloads.html) v0.12.x on your sys
```sh ```sh
$ terraform version $ terraform version
Terraform v0.12.7 Terraform v0.12.12
``` ```
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
@ -49,7 +49,7 @@ Configure the AWS provider to use your access key credentials in a `providers.tf
```tf ```tf
provider "aws" { provider "aws" {
version = "2.29.0" version = "2.35.0"
region = "eu-central-1" region = "eu-central-1"
shared_credentials_file = "/home/user/.config/aws/credentials" shared_credentials_file = "/home/user/.config/aws/credentials"
} }
@ -70,7 +70,7 @@ Define a Kubernetes cluster using the module `aws/container-linux/kubernetes`.
```tf ```tf
module "tempest" { module "tempest" {
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.16.0" source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.16.3"
# AWS # AWS
cluster_name = "tempest" cluster_name = "tempest"
@ -135,9 +135,9 @@ In 4-8 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/tempest/auth/kubeconfig $ export KUBECONFIG=/home/user/.secrets/clusters/tempest/auth/kubeconfig
$ kubectl get nodes $ kubectl get nodes
NAME STATUS ROLES AGE VERSION NAME STATUS ROLES AGE VERSION
ip-10-0-3-155 Ready <none> 10m v1.16.0 ip-10-0-3-155 Ready <none> 10m v1.16.3
ip-10-0-26-65 Ready <none> 10m v1.16.0 ip-10-0-26-65 Ready <none> 10m v1.16.3
ip-10-0-41-21 Ready <none> 10m v1.16.0 ip-10-0-41-21 Ready <none> 10m v1.16.3
``` ```
List the pods. List the pods.
@ -177,7 +177,7 @@ Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/aws/con
| dns_zone | AWS Route53 DNS zone | "aws.example.com" | | dns_zone | AWS Route53 DNS zone | "aws.example.com" |
| dns_zone_id | AWS Route53 DNS zone id | "Z3PAABBCFAKEC0" | | dns_zone_id | AWS Route53 DNS zone id | "Z3PAABBCFAKEC0" |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." | | ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/tempest" | | asset_dir | Absolute path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/tempest" |
#### DNS Zone #### DNS Zone
@ -191,7 +191,7 @@ resource "aws_route53_zone" "zone-for-clusters" {
} }
``` ```
Reference the DNS zone id with `"${aws_route53_zone.zone-for-clusters.zone_id}"`. Reference the DNS zone id with `aws_route53_zone.zone-for-clusters.zone_id`.
!!! tip "" !!! tip ""
If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Route53 (e.g. aws.mydomain.com) and [update nameservers](http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/SOA-NSrecords.html). If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Route53 (e.g. aws.mydomain.com) and [update nameservers](http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/SOA-NSrecords.html).
@ -205,11 +205,11 @@ Reference the DNS zone id with `"${aws_route53_zone.zone-for-clusters.zone_id}"`
| controller_type | EC2 instance type for controllers | "t3.small" | See below | | controller_type | EC2 instance type for controllers | "t3.small" | See below |
| worker_type | EC2 instance type for workers | "t3.small" | See below | | worker_type | EC2 instance type for workers | "t3.small" | See below |
| os_image | AMI channel for a Container Linux derivative | coreos-stable | coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge | | os_image | AMI channel for a Container Linux derivative | coreos-stable | coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge |
| disk_size | Size of the EBS volume in GB | "40" | "100" | | disk_size | Size of the EBS volume in GB | 40 | 100 |
| disk_type | Type of the EBS volume | "gp2" | standard, gp2, io1 | | disk_type | Type of the EBS volume | "gp2" | standard, gp2, io1 |
| disk_iops | IOPS of the EBS volume | "0" (i.e. auto) | "400" | | disk_iops | IOPS of the EBS volume | 0 (i.e. auto) | 400 |
| worker_target_groups | Target group ARNs to which worker instances should be added | [] | ["${aws_lb_target_group.app.id}"] | | worker_target_groups | Target group ARNs to which worker instances should be added | [] | [aws_lb_target_group.app.id] |
| worker_price | Spot price in USD for workers. Leave as default empty string for regular on-demand instances | "" | "0.10" | | worker_price | Spot price in USD for worker instances or 0 to use on-demand instances | 0/null | 0.10 |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/) | | controller_clc_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/) |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/) | | worker_clc_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/) |
| networking | Choice of networking provider | "calico" | "calico" or "flannel" | | networking | Choice of networking provider | "calico" | "calico" or "flannel" |
@ -217,7 +217,7 @@ Reference the DNS zone id with `"${aws_route53_zone.zone-for-clusters.zone_id}"`
| host_cidr | CIDR IPv4 range to assign to EC2 instances | "10.0.0.0/16" | "10.1.0.0/16" | | host_cidr | CIDR IPv4 range to assign to EC2 instances | "10.0.0.0/16" | "10.1.0.0/16" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" | | pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" | | service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by coredns. | "cluster.local" | "k8s.example.com" | | worker_node_labels | List of initial worker node labels | [] | ["worker-pool=default"] |
Check the list of valid [instance types](https://aws.amazon.com/ec2/instance-types/). Check the list of valid [instance types](https://aws.amazon.com/ec2/instance-types/).

View File

@ -3,7 +3,7 @@
!!! danger !!! danger
Typhoon for Azure is alpha. For production, use AWS, Google Cloud, or bare-metal. As Azure matures, check [errata](https://github.com/poseidon/typhoon/wiki/Errata) for known shortcomings. Typhoon for Azure is alpha. For production, use AWS, Google Cloud, or bare-metal. As Azure matures, check [errata](https://github.com/poseidon/typhoon/wiki/Errata) for known shortcomings.
In this tutorial, we'll create a Kubernetes v1.16.0 cluster on Azure with Container Linux. In this tutorial, we'll create a Kubernetes v1.16.3 cluster on Azure with Container Linux.
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a resource group, virtual network, subnets, security groups, controller availability set, worker scale set, load balancer, and TLS assets. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a resource group, virtual network, subnets, security groups, controller availability set, worker scale set, load balancer, and TLS assets.
@ -21,7 +21,7 @@ Install [Terraform](https://www.terraform.io/downloads.html) v0.12.x on your sys
```sh ```sh
$ terraform version $ terraform version
Terraform v0.12.7 Terraform v0.12.12
``` ```
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
@ -50,7 +50,7 @@ Configure the Azure provider in a `providers.tf` file.
```tf ```tf
provider "azurerm" { provider "azurerm" {
version = "1.34.0" version = "1.36.1"
} }
provider "ct" { provider "ct" {
@ -66,7 +66,7 @@ Define a Kubernetes cluster using the module `azure/container-linux/kubernetes`.
```tf ```tf
module "ramius" { module "ramius" {
source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes?ref=v1.16.0" source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes?ref=v1.16.3"
# Azure # Azure
cluster_name = "ramius" cluster_name = "ramius"
@ -132,9 +132,9 @@ In 4-8 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/ramius/auth/kubeconfig $ export KUBECONFIG=/home/user/.secrets/clusters/ramius/auth/kubeconfig
$ kubectl get nodes $ kubectl get nodes
NAME STATUS ROLES AGE VERSION NAME STATUS ROLES AGE VERSION
ramius-controller-0 Ready <none> 24m v1.16.0 ramius-controller-0 Ready <none> 24m v1.16.3
ramius-worker-000001 Ready <none> 25m v1.16.0 ramius-worker-000001 Ready <none> 25m v1.16.3
ramius-worker-000002 Ready <none> 24m v1.16.0 ramius-worker-000002 Ready <none> 24m v1.16.3
``` ```
List the pods. List the pods.
@ -144,9 +144,9 @@ $ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-7c6fbb4f4b-b6qzx 1/1 Running 0 26m kube-system coredns-7c6fbb4f4b-b6qzx 1/1 Running 0 26m
kube-system coredns-7c6fbb4f4b-j2k3d 1/1 Running 0 26m kube-system coredns-7c6fbb4f4b-j2k3d 1/1 Running 0 26m
kube-system flannel-bwf24 2/2 Running 0 26m kube-system calico-node-1m5bf 2/2 Running 0 26m
kube-system flannel-ks5qb 2/2 Running 0 26m kube-system calico-node-7jmr1 2/2 Running 0 26m
kube-system flannel-tq2wg 2/2 Running 0 26m kube-system calico-node-bknc8 2/2 Running 0 26m
kube-system kube-apiserver-ramius-controller-0 1/1 Running 0 26m kube-system kube-apiserver-ramius-controller-0 1/1 Running 0 26m
kube-system kube-controller-manager-ramius-controller-0 1/1 Running 0 26m kube-system kube-controller-manager-ramius-controller-0 1/1 Running 0 26m
kube-system kube-proxy-j4vpq 1/1 Running 0 26m kube-system kube-proxy-j4vpq 1/1 Running 0 26m
@ -175,7 +175,7 @@ Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/azure/c
| dns_zone | Azure DNS zone | "azure.example.com" | | dns_zone | Azure DNS zone | "azure.example.com" |
| dns_zone_group | Resource group where the Azure DNS zone resides | "global" | | dns_zone_group | Resource group where the Azure DNS zone resides | "global" |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." | | ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/ramius" | | asset_dir | Absolute path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/ramius" |
!!! tip !!! tip
Regions are shown in [docs](https://azure.microsoft.com/en-us/global-infrastructure/regions/) or with `az account list-locations --output table`. Regions are shown in [docs](https://azure.microsoft.com/en-us/global-infrastructure/regions/) or with `az account list-locations --output table`.
@ -195,14 +195,14 @@ resource "azurerm_resource_group" "global" {
# DNS zone for clusters # DNS zone for clusters
resource "azurerm_dns_zone" "clusters" { resource "azurerm_dns_zone" "clusters" {
resource_group_name = "${azurerm_resource_group.global.name}" resource_group_name = azurerm_resource_group.global.name
name = "azure.example.com" name = "azure.example.com"
zone_type = "Public" zone_type = "Public"
} }
``` ```
Reference the DNS zone with `"${azurerm_dns_zone.clusters.name}"` and its resource group with `"${azurerm_resource_group.global.name}"`. Reference the DNS zone with `azurerm_dns_zone.clusters.name` and its resource group with `"azurerm_resource_group.global.name`.
!!! tip "" !!! tip ""
If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Azure DNS (e.g. azure.mydomain.com) and [update nameservers](https://docs.microsoft.com/en-us/azure/dns/dns-delegate-domain-azure-dns). If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Azure DNS (e.g. azure.mydomain.com) and [update nameservers](https://docs.microsoft.com/en-us/azure/dns/dns-delegate-domain-azure-dns).
@ -213,18 +213,18 @@ Reference the DNS zone with `"${azurerm_dns_zone.clusters.name}"` and its resour
|:-----|:------------|:--------|:--------| |:-----|:------------|:--------|:--------|
| controller_count | Number of controllers (i.e. masters) | 1 | 1 | | controller_count | Number of controllers (i.e. masters) | 1 | 1 |
| worker_count | Number of workers | 1 | 3 | | worker_count | Number of workers | 1 | 3 |
| controller_type | Machine type for controllers | "Standard_DS1_v2" | See below | | controller_type | Machine type for controllers | "Standard_B2s" | See below |
| worker_type | Machine type for workers | "Standard_F1" | See below | | worker_type | Machine type for workers | "Standard_DS1_v2" | See below |
| os_image | Channel for a Container Linux derivative | coreos-stable | coreos-stable, coreos-beta, coreos-alpha | | os_image | Channel for a Container Linux derivative | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha |
| disk_size | Size of the disk in GB | "40" | "100" | | disk_size | Size of the disk in GB | 40 | 100 |
| worker_priority | Set priority to Low to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time | Regular | Low | | worker_priority | Set priority to Low to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time | Regular | Low |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/#usage) | | controller_clc_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/#usage) |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/#usage) | | worker_clc_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/#usage) |
| networking | Choice of networking provider | "flannel" | "flannel" or "calico" (experimental) | | networking | Choice of networking provider | "calico" | "flannel" or "calico" |
| host_cidr | CIDR IPv4 range to assign to instances | "10.0.0.0/16" | "10.0.0.0/20" | | host_cidr | CIDR IPv4 range to assign to instances | "10.0.0.0/16" | "10.0.0.0/20" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" | | pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" | | service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by coredns. | "cluster.local" | "k8s.example.com" | | worker_node_labels | List of initial worker node labels | [] | ["worker-pool=default"] |
Check the list of valid [machine types](https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux/) and their [specs](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/sizes-general). Use `az vm list-skus` to get the identifier. Check the list of valid [machine types](https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux/) and their [specs](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/sizes-general). Use `az vm list-skus` to get the identifier.
@ -232,7 +232,7 @@ Check the list of valid [machine types](https://azure.microsoft.com/en-us/pricin
Unlike AWS and GCP, Azure requires its *virtual* networks to have non-overlapping IPv4 CIDRs (yeah, go figure). Instead of each cluster just using `10.0.0.0/16` for instances, each Azure cluster's `host_cidr` must be non-overlapping (e.g. 10.0.0.0/20 for the 1st cluster, 10.0.16.0/20 for the 2nd cluster, etc). Unlike AWS and GCP, Azure requires its *virtual* networks to have non-overlapping IPv4 CIDRs (yeah, go figure). Instead of each cluster just using `10.0.0.0/16` for instances, each Azure cluster's `host_cidr` must be non-overlapping (e.g. 10.0.0.0/20 for the 1st cluster, 10.0.16.0/20 for the 2nd cluster, etc).
!!! warning !!! warning
Do not choose a `controller_type` smaller than `Standard_DS1_v2`. Smaller instances are not sufficient for running a controller. Do not choose a `controller_type` smaller than `Standard_B2s`. Smaller instances are not sufficient for running a controller.
#### Low Priority #### Low Priority

View File

@ -1,6 +1,6 @@
# Bare-Metal # Bare-Metal
In this tutorial, we'll network boot and provision a Kubernetes v1.16.0 cluster on bare-metal with Container Linux. In this tutorial, we'll network boot and provision a Kubernetes v1.16.3 cluster on bare-metal with Container Linux.
First, we'll deploy a [Matchbox](https://github.com/poseidon/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Container Linux to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers via Ignition. First, we'll deploy a [Matchbox](https://github.com/poseidon/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Container Linux to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers via Ignition.
@ -111,7 +111,7 @@ Install [Terraform](https://www.terraform.io/downloads.html) v0.12.x on your sys
```sh ```sh
$ terraform version $ terraform version
Terraform v0.12.7 Terraform v0.12.12
``` ```
Add the [terraform-provider-matchbox](https://github.com/poseidon/terraform-provider-matchbox) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. Add the [terraform-provider-matchbox](https://github.com/poseidon/terraform-provider-matchbox) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
@ -144,9 +144,9 @@ Configure the Matchbox provider to use your Matchbox API endpoint and client cer
provider "matchbox" { provider "matchbox" {
version = "0.3.0" version = "0.3.0"
endpoint = "matchbox.example.com:8081" endpoint = "matchbox.example.com:8081"
client_cert = "${file("~/.config/matchbox/client.crt")}" client_cert = file("~/.config/matchbox/client.crt")
client_key = "${file("~/.config/matchbox/client.key")}" client_key = file("~/.config/matchbox/client.key")
ca = "${file("~/.config/matchbox/ca.crt")}" ca = file("~/.config/matchbox/ca.crt")
} }
provider "ct" { provider "ct" {
@ -160,13 +160,13 @@ Define a Kubernetes cluster using the module `bare-metal/container-linux/kuberne
```tf ```tf
module "bare-metal-mercury" { module "bare-metal-mercury" {
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.16.0" source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.16.3"
# bare-metal # bare-metal
cluster_name = "mercury" cluster_name = "mercury"
matchbox_http_endpoint = "http://matchbox.example.com" matchbox_http_endpoint = "http://matchbox.example.com"
os_channel = "coreos-stable" os_channel = "coreos-stable"
os_version = "1632.3.0" os_version = "2191.5.0"
# configuration # configuration
k8s_domain_name = "node1.example.com" k8s_domain_name = "node1.example.com"
@ -174,20 +174,22 @@ module "bare-metal-mercury" {
asset_dir = "/home/user/.secrets/clusters/mercury" asset_dir = "/home/user/.secrets/clusters/mercury"
# machines # machines
controller_names = ["node1"] controllers = [{
controller_macs = ["52:54:00:a1:9c:ae"] name = "node1"
controller_domains = ["node1.example.com"] mac = "52:54:00:a1:9c:ae"
worker_names = [ domain = "node1.example.com"
"node2", }]
"node3", workers = [
] {
worker_macs = [ name = "node2",
"52:54:00:b2:2f:86", mac = "52:54:00:b2:2f:86"
"52:54:00:c3:61:77", domain = "node2.example.com"
] },
worker_domains = [ {
"node2.example.com", name = "node3",
"node3.example.com", mac = "52:54:00:c3:61:77"
domain = "node3.example.com"
}
] ]
# set to http only if you cannot chainload to iPXE firmware with https support # set to http only if you cannot chainload to iPXE firmware with https support
@ -263,9 +265,9 @@ Apply complete! Resources: 55 added, 0 changed, 0 destroyed.
To watch the install to disk (until machines reboot from disk), SSH to port 2222. To watch the install to disk (until machines reboot from disk), SSH to port 2222.
``` ```
# before v1.16.0 # before v1.16.3
$ ssh debug@node1.example.com $ ssh debug@node1.example.com
# after v1.16.0 # after v1.16.3
$ ssh -p 2222 core@node1.example.com $ ssh -p 2222 core@node1.example.com
``` ```
@ -289,9 +291,9 @@ systemd[1]: Started Kubernetes control plane.
$ export KUBECONFIG=/home/user/.secrets/clusters/mercury/auth/kubeconfig $ export KUBECONFIG=/home/user/.secrets/clusters/mercury/auth/kubeconfig
$ kubectl get nodes $ kubectl get nodes
NAME STATUS ROLES AGE VERSION NAME STATUS ROLES AGE VERSION
node1.example.com Ready <none> 10m v1.16.0 node1.example.com Ready <none> 10m v1.16.3
node2.example.com Ready <none> 10m v1.16.0 node2.example.com Ready <none> 10m v1.16.3
node3.example.com Ready <none> 10m v1.16.0 node3.example.com Ready <none> 10m v1.16.3
``` ```
List the pods. List the pods.
@ -327,19 +329,15 @@ Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/bare-me
| Name | Description | Example | | Name | Description | Example |
|:-----|:------------|:--------| |:-----|:------------|:--------|
| cluster_name | Unique cluster name | mercury | | cluster_name | Unique cluster name | "mercury" |
| matchbox_http_endpoint | Matchbox HTTP read-only endpoint | http://matchbox.example.com:port | | matchbox_http_endpoint | Matchbox HTTP read-only endpoint | "http://matchbox.example.com:port" |
| os_channel | Channel for a Container Linux derivative | coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge | | os_channel | Channel for a Container Linux derivative | coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge |
| os_version | Version for a Container Linux derivative to PXE and install | 1632.3.0 | | os_version | Version for a Container Linux derivative to PXE and install | "1632.3.0" |
| k8s_domain_name | FQDN resolving to the controller(s) nodes. Workers and kubectl will communicate with this endpoint | "myk8s.example.com" | | k8s_domain_name | FQDN resolving to the controller(s) nodes. Workers and kubectl will communicate with this endpoint | "myk8s.example.com" |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3Nz..." | | ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3Nz..." |
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/mercury" | | asset_dir | Absolute path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/mercury" |
| controller_names | Ordered list of controller short names | ["node1"] | | controllers | List of controller machine detail objects (unique name, identifying MAC address, FQDN) | `[{name="node1", mac="52:54:00:a1:9c:ae", domain="node1.example.com"}]` |
| controller_macs | Ordered list of controller identifying MAC addresses | ["52:54:00:a1:9c:ae"] | | workers | List of worker machine detail objects (unique name, identifying MAC address, FQDN) | `[{name="node2", mac="52:54:00:b2:2f:86", domain="node2.example.com"}, {name="node3", mac="52:54:00:c3:61:77", domain="node3.example.com"}]` |
| controller_domains | Ordered list of controller FQDNs | ["node1.example.com"] |
| worker_names | Ordered list of worker short names | ["node2", "node3"] |
| worker_macs | Ordered list of worker identifying MAC addresses | ["52:54:00:b2:2f:86", "52:54:00:c3:61:77"] |
| worker_domains | Ordered list of worker FQDNs | ["node2.example.com", "node3.example.com"] |
### Optional ### Optional
@ -351,9 +349,8 @@ Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/bare-me
| networking | Choice of networking provider | "calico" | "calico" or "flannel" | | networking | Choice of networking provider | "calico" | "calico" or "flannel" |
| network_mtu | CNI interface MTU (calico-only) | 1480 | - | | network_mtu | CNI interface MTU (calico-only) | 1480 | - |
| clc_snippets | Map from machine names to lists of Container Linux Config snippets | {} | [example](/advanced/customization/#usage) | | clc_snippets | Map from machine names to lists of Container Linux Config snippets | {} | [example](/advanced/customization/#usage) |
| network_ip_autodetection_method | Method to detect host IPv4 address (calico-only) | first-found | can-reach=10.0.0.1 | | network_ip_autodetection_method | Method to detect host IPv4 address (calico-only) | "first-found" | "can-reach=10.0.0.1" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" | | pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" | | service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by coredns. | "cluster.local" | "k8s.example.com" | | kernel_args | Additional kernel args to provide at PXE boot | [] | ["kvm-intel.nested=1"] |
| kernel_args | Additional kernel args to provide at PXE boot | [] | "kvm-intel.nested=1" |

View File

@ -1,6 +1,6 @@
# Digital Ocean # Digital Ocean
In this tutorial, we'll create a Kubernetes v1.16.0 cluster on DigitalOcean with Container Linux. In this tutorial, we'll create a Kubernetes v1.16.3 cluster on DigitalOcean with Container Linux.
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create controller droplets, worker droplets, DNS records, tags, and TLS assets. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create controller droplets, worker droplets, DNS records, tags, and TLS assets.
@ -18,7 +18,7 @@ Install [Terraform](https://www.terraform.io/downloads.html) v0.12.x on your sys
```sh ```sh
$ terraform version $ terraform version
Terraform v0.12.7 Terraform v0.12.12
``` ```
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
@ -50,7 +50,7 @@ Configure the DigitalOcean provider to use your token in a `providers.tf` file.
```tf ```tf
provider "digitalocean" { provider "digitalocean" {
version = "1.7.0" version = "1.11.0"
token = "${chomp(file("~/.config/digital-ocean/token"))}" token = "${chomp(file("~/.config/digital-ocean/token"))}"
} }
@ -65,7 +65,7 @@ Define a Kubernetes cluster using the module `digital-ocean/container-linux/kube
```tf ```tf
module "digital-ocean-nemo" { module "digital-ocean-nemo" {
source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes?ref=v1.16.0" source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes?ref=v1.16.3"
# Digital Ocean # Digital Ocean
cluster_name = "nemo" cluster_name = "nemo"
@ -130,9 +130,9 @@ In 3-6 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/nemo/auth/kubeconfig $ export KUBECONFIG=/home/user/.secrets/clusters/nemo/auth/kubeconfig
$ kubectl get nodes $ kubectl get nodes
NAME STATUS ROLES AGE VERSION NAME STATUS ROLES AGE VERSION
10.132.110.130 Ready <none> 10m v1.16.0 10.132.110.130 Ready <none> 10m v1.16.3
10.132.115.81 Ready <none> 10m v1.16.0 10.132.115.81 Ready <none> 10m v1.16.3
10.132.124.107 Ready <none> 10m v1.16.0 10.132.124.107 Ready <none> 10m v1.16.3
``` ```
List the pods. List the pods.
@ -141,9 +141,9 @@ List the pods.
NAMESPACE NAME READY STATUS RESTARTS AGE NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-1187388186-ld1j7 1/1 Running 0 11m kube-system coredns-1187388186-ld1j7 1/1 Running 0 11m
kube-system coredns-1187388186-rdhf7 1/1 Running 0 11m kube-system coredns-1187388186-rdhf7 1/1 Running 0 11m
kube-system flannel-1cq1v 2/2 Running 0 11m kube-system calico-node-1m5bf 2/2 Running 0 11m
kube-system flannel-hq9t0 2/2 Running 0 11m kube-system calico-node-7jmr1 2/2 Running 0 11m
kube-system flannel-v0g9w 2/2 Running 0 11m kube-system calico-node-bknc8 2/2 Running 0 11m
kube-system kube-apiserver-ip-10.132.115.81 1/1 Running 0 11m kube-system kube-apiserver-ip-10.132.115.81 1/1 Running 0 11m
kube-system kube-controller-manager-ip-10.132.115.81 1/1 Running 0 11m kube-system kube-controller-manager-ip-10.132.115.81 1/1 Running 0 11m
kube-system kube-proxy-6kxjf 1/1 Running 0 11m kube-system kube-proxy-6kxjf 1/1 Running 0 11m
@ -167,11 +167,11 @@ Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/digital
| Name | Description | Example | | Name | Description | Example |
|:-----|:------------|:--------| |:-----|:------------|:--------|
| cluster_name | Unique cluster name (prepended to dns_zone) | nemo | | cluster_name | Unique cluster name (prepended to dns_zone) | "nemo" |
| region | Digital Ocean region | nyc1, sfo2, fra1, tor1 | | region | Digital Ocean region | "nyc1", "sfo2", "fra1", tor1" |
| dns_zone | Digital Ocean domain (i.e. DNS zone) | do.example.com | | dns_zone | Digital Ocean domain (i.e. DNS zone) | "do.example.com" |
| ssh_fingerprints | SSH public key fingerprints | ["d7:9d..."] | | ssh_fingerprints | SSH public key fingerprints | ["d7:9d..."] |
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | /home/user/.secrets/nemo | | asset_dir | Absolute path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/nemo" |
#### DNS Zone #### DNS Zone
@ -214,15 +214,14 @@ Digital Ocean requires the SSH public key be uploaded to your account, so you ma
|:-----|:------------|:--------|:--------| |:-----|:------------|:--------|:--------|
| controller_count | Number of controllers (i.e. masters) | 1 | 1 | | controller_count | Number of controllers (i.e. masters) | 1 | 1 |
| worker_count | Number of workers | 1 | 3 | | worker_count | Number of workers | 1 | 3 |
| controller_type | Droplet type for controllers | s-2vcpu-2gb | s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb, ... | | controller_type | Droplet type for controllers | "s-2vcpu-2gb" | s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb, ... |
| worker_type | Droplet type for workers | s-1vcpu-2gb | s-1vcpu-2gb, s-2vcpu-2gb, ... | | worker_type | Droplet type for workers | "s-1vcpu-2gb" | s-1vcpu-2gb, s-2vcpu-2gb, ... |
| image | Container Linux image for instances | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha | | image | Container Linux image for instances | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/) | | controller_clc_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/) |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/) | | worker_clc_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/) |
| networking | Choice of networking provider | "flannel" | "flannel" or "calico" (experimental) | | networking | Choice of networking provider | "calico" | "flannel" or "calico" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" | | pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" | | service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by coredns. | "cluster.local" | "k8s.example.com" |
Check the list of valid [droplet types](https://developers.digitalocean.com/documentation/changelog/api-v2/new-size-slugs-for-droplet-plan-changes/) or use `doctl compute size list`. Check the list of valid [droplet types](https://developers.digitalocean.com/documentation/changelog/api-v2/new-size-slugs-for-droplet-plan-changes/) or use `doctl compute size list`.

View File

@ -1,6 +1,6 @@
# Google Cloud # Google Cloud
In this tutorial, we'll create a Kubernetes v1.16.0 cluster on Google Compute Engine with Container Linux. In this tutorial, we'll create a Kubernetes v1.16.3 cluster on Google Compute Engine with Container Linux.
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a network, firewall rules, health checks, controller instances, worker managed instance group, load balancers, and TLS assets. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a network, firewall rules, health checks, controller instances, worker managed instance group, load balancers, and TLS assets.
@ -18,7 +18,7 @@ Install [Terraform](https://www.terraform.io/downloads.html) v0.12.x on your sys
```sh ```sh
$ terraform version $ terraform version
Terraform v0.12.7 Terraform v0.12.12
``` ```
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
@ -49,10 +49,10 @@ Configure the Google Cloud provider to use your service account key, project-id,
```tf ```tf
provider "google" { provider "google" {
version = "2.15.0" version = "2.19.0"
project = "project-id" project = "project-id"
region = "us-central1" region = "us-central1"
credentials = "${file("~/.config/google-cloud/terraform.json")}" credentials = file("~/.config/google-cloud/terraform.json")
} }
provider "ct" { provider "ct" {
@ -71,7 +71,7 @@ Define a Kubernetes cluster using the module `google-cloud/container-linux/kuber
```tf ```tf
module "google-cloud-yavin" { module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.16.0" source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.16.3"
# Google Cloud # Google Cloud
cluster_name = "yavin" cluster_name = "yavin"
@ -137,9 +137,9 @@ In 4-8 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig $ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes $ kubectl get nodes
NAME ROLES STATUS AGE VERSION NAME ROLES STATUS AGE VERSION
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.16.0 yavin-controller-0.c.example-com.internal <none> Ready 6m v1.16.3
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.16.0 yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.16.3
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.16.0 yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.16.3
``` ```
List the pods. List the pods.
@ -180,7 +180,7 @@ Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/google-
| dns_zone | Google Cloud DNS zone | "google-cloud.example.com" | | dns_zone | Google Cloud DNS zone | "google-cloud.example.com" |
| dns_zone_name | Google Cloud DNS zone name | "example-zone" | | dns_zone_name | Google Cloud DNS zone name | "example-zone" |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." | | ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/yavin" | | asset_dir | Absolute path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/yavin" |
Check the list of valid [regions](https://cloud.google.com/compute/docs/regions-zones/regions-zones) and list Container Linux [images](https://cloud.google.com/compute/docs/images) with `gcloud compute images list | grep coreos`. Check the list of valid [regions](https://cloud.google.com/compute/docs/regions-zones/regions-zones) and list Container Linux [images](https://cloud.google.com/compute/docs/images) with `gcloud compute images list | grep coreos`.
@ -217,7 +217,7 @@ resource "google_dns_managed_zone" "zone-for-clusters" {
| networking | Choice of networking provider | "calico" | "calico" or "flannel" | | networking | Choice of networking provider | "calico" | "calico" or "flannel" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" | | pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" | | service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by coredns. | "cluster.local" | "k8s.example.com" | | worker_node_labels | List of initial worker node labels | [] | ["worker-pool=default"] |
Check the list of valid [machine types](https://cloud.google.com/compute/docs/machine-types). Check the list of valid [machine types](https://cloud.google.com/compute/docs/machine-types).

View File

@ -3,7 +3,7 @@
!!! danger !!! danger
Typhoon for Fedora CoreOS is an early preview! Fedora CoreOS itself is a preview! Expect bugs and design shifts. Please help both projects solve problems. Report Fedora CoreOS bugs to [Fedora](https://github.com/coreos/fedora-coreos-tracker/issues). Report Typhoon issues to Typhoon. Typhoon for Fedora CoreOS is an early preview! Fedora CoreOS itself is a preview! Expect bugs and design shifts. Please help both projects solve problems. Report Fedora CoreOS bugs to [Fedora](https://github.com/coreos/fedora-coreos-tracker/issues). Report Typhoon issues to Typhoon.
In this tutorial, we'll create a Kubernetes v1.16.0 cluster on AWS with Fedora CoreOS. In this tutorial, we'll create a Kubernetes v1.16.3 cluster on AWS with Fedora CoreOS.
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets.
@ -21,7 +21,7 @@ Install [Terraform](https://www.terraform.io/downloads.html) v0.12.x on your sys
```sh ```sh
$ terraform version $ terraform version
Terraform v0.12.7 Terraform v0.12.12
``` ```
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
@ -52,8 +52,8 @@ Configure the AWS provider to use your access key credentials in a `providers.tf
```tf ```tf
provider "aws" { provider "aws" {
version = "2.29.0" version = "2.35.0"
region = "us-east-1" # MUST be us-east-1 right now! region = "eu-central-1"
shared_credentials_file = "/home/user/.config/aws/credentials" shared_credentials_file = "/home/user/.config/aws/credentials"
} }
@ -73,7 +73,7 @@ Define a Kubernetes cluster using the module `aws/fedora-coreos/kubernetes`.
```tf ```tf
module "aws-tempest" { module "aws-tempest" {
source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=DEVELOPMENT_SHA" source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.16.3"
# AWS # AWS
cluster_name = "tempest" cluster_name = "tempest"
@ -138,9 +138,9 @@ In 4-8 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/tempest/auth/kubeconfig $ export KUBECONFIG=/home/user/.secrets/clusters/tempest/auth/kubeconfig
$ kubectl get nodes $ kubectl get nodes
NAME STATUS ROLES AGE VERSION NAME STATUS ROLES AGE VERSION
ip-10-0-3-155 Ready <none> 10m v1.16.0 ip-10-0-3-155 Ready <none> 10m v1.16.3
ip-10-0-26-65 Ready <none> 10m v1.16.0 ip-10-0-26-65 Ready <none> 10m v1.16.3
ip-10-0-41-21 Ready <none> 10m v1.16.0 ip-10-0-41-21 Ready <none> 10m v1.16.3
``` ```
List the pods. List the pods.
@ -191,7 +191,7 @@ resource "aws_route53_zone" "zone-for-clusters" {
} }
``` ```
Reference the DNS zone id with `"${aws_route53_zone.zone-for-clusters.zone_id}"`. Reference the DNS zone id with `aws_route53_zone.zone-for-clusters.zone_id`.
!!! tip "" !!! tip ""
If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Route53 (e.g. aws.mydomain.com) and [update nameservers](http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/SOA-NSrecords.html). If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Route53 (e.g. aws.mydomain.com) and [update nameservers](http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/SOA-NSrecords.html).
@ -205,11 +205,11 @@ Reference the DNS zone id with `"${aws_route53_zone.zone-for-clusters.zone_id}"`
| controller_type | EC2 instance type for controllers | "t3.small" | See below | | controller_type | EC2 instance type for controllers | "t3.small" | See below |
| worker_type | EC2 instance type for workers | "t3.small" | See below | | worker_type | EC2 instance type for workers | "t3.small" | See below |
| os_image | AMI channel for Fedora CoreOS | not yet used | ? | | os_image | AMI channel for Fedora CoreOS | not yet used | ? |
| disk_size | Size of the EBS volume in GB | "40" | "100" | | disk_size | Size of the EBS volume in GB | 40 | 100 |
| disk_type | Type of the EBS volume | "gp2" | standard, gp2, io1 | | disk_type | Type of the EBS volume | "gp2" | standard, gp2, io1 |
| disk_iops | IOPS of the EBS volume | "0" (i.e. auto) | "400" | | disk_iops | IOPS of the EBS volume | 0 (i.e. auto) | 400 |
| worker_target_groups | Target group ARNs to which worker instances should be added | [] | ["${aws_lb_target_group.app.id}"] | | worker_target_groups | Target group ARNs to which worker instances should be added | [] | [aws_lb_target_group.app.id] |
| worker_price | Spot price in USD for workers. Leave as default empty string for regular on-demand instances | "" | "0.10" | | worker_price | Spot price in USD for worker instances or 0 to use on-demand instances | 0 | 0.10 |
| controller_snippets | Controller Fedora CoreOS Config snippets | [] | UNSUPPORTED | | controller_snippets | Controller Fedora CoreOS Config snippets | [] | UNSUPPORTED |
| worker_clc_snippets | Worker Fedora CoreOS Config snippets | [] | UNSUPPORTED | | worker_clc_snippets | Worker Fedora CoreOS Config snippets | [] | UNSUPPORTED |
| networking | Choice of networking provider | "calico" | "calico" or "flannel" | | networking | Choice of networking provider | "calico" | "calico" or "flannel" |
@ -217,7 +217,7 @@ Reference the DNS zone id with `"${aws_route53_zone.zone-for-clusters.zone_id}"`
| host_cidr | CIDR IPv4 range to assign to EC2 instances | "10.0.0.0/16" | "10.1.0.0/16" | | host_cidr | CIDR IPv4 range to assign to EC2 instances | "10.0.0.0/16" | "10.1.0.0/16" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" | | pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" | | service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by coredns. | "cluster.local" | "k8s.example.com" | | worker_node_labels | List of initial worker node labels | [] | ["worker-pool=default"] |
Check the list of valid [instance types](https://aws.amazon.com/ec2/instance-types/). Check the list of valid [instance types](https://aws.amazon.com/ec2/instance-types/).

View File

@ -3,7 +3,7 @@
!!! danger !!! danger
Typhoon for Fedora CoreOS is an early preview! Fedora CoreOS itself is a preview! Expect bugs and design shifts. Please help both projects solve problems. Report Fedora CoreOS bugs to [Fedora](https://github.com/coreos/fedora-coreos-tracker/issues). Report Typhoon issues to Typhoon. Typhoon for Fedora CoreOS is an early preview! Fedora CoreOS itself is a preview! Expect bugs and design shifts. Please help both projects solve problems. Report Fedora CoreOS bugs to [Fedora](https://github.com/coreos/fedora-coreos-tracker/issues). Report Typhoon issues to Typhoon.
In this tutorial, we'll network boot and provision a Kubernetes v1.16.0 cluster on bare-metal with Fedora CoreOS. In this tutorial, we'll network boot and provision a Kubernetes v1.16.3 cluster on bare-metal with Fedora CoreOS.
First, we'll deploy a [Matchbox](https://github.com/poseidon/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Fedora CoreOS to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers via Ignition. First, we'll deploy a [Matchbox](https://github.com/poseidon/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Fedora CoreOS to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers via Ignition.
@ -114,7 +114,7 @@ Install [Terraform](https://www.terraform.io/downloads.html) v0.12.x on your sys
```sh ```sh
$ terraform version $ terraform version
Terraform v0.12.7 Terraform v0.12.12
``` ```
Add the [terraform-provider-matchbox](https://github.com/poseidon/terraform-provider-matchbox) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. Add the [terraform-provider-matchbox](https://github.com/poseidon/terraform-provider-matchbox) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
@ -147,9 +147,9 @@ Configure the Matchbox provider to use your Matchbox API endpoint and client cer
provider "matchbox" { provider "matchbox" {
version = "0.3.0" version = "0.3.0"
endpoint = "matchbox.example.com:8081" endpoint = "matchbox.example.com:8081"
client_cert = "${file("~/.config/matchbox/client.crt")}" client_cert = file("~/.config/matchbox/client.crt")
client_key = "${file("~/.config/matchbox/client.key")}" client_key = file("~/.config/matchbox/client.key")
ca = "${file("~/.config/matchbox/ca.crt")}" ca = file("~/.config/matchbox/ca.crt")
} }
provider "ct" { provider "ct" {
@ -163,14 +163,14 @@ Define a Kubernetes cluster using the module `bare-metal/fedora-coreos/kubernete
```tf ```tf
module "bare-metal-mercury" { module "bare-metal-mercury" {
source = "git::https://github.com/poseidon/typhoon//bare-metal/fedora-coreos/kubernetes?ref=DEVELOPMENT_SHA" source = "git::https://github.com/poseidon/typhoon//bare-metal/fedora-coreos/kubernetes?ref=v1.16.3"
# bare-metal # bare-metal
cluster_name = "mercury" cluster_name = "mercury"
matchbox_http_endpoint = "http://matchbox.example.com" matchbox_http_endpoint = "http://matchbox.example.com"
os_stream = "testing" os_stream = "testing"
os_version = "30.20190801.0" os_version = "30.20191002.0"
cached_install = "true" cached_install = true
# configuration # configuration
k8s_domain_name = "node1.example.com" k8s_domain_name = "node1.example.com"
@ -178,20 +178,22 @@ module "bare-metal-mercury" {
asset_dir = "/home/user/.secrets/clusters/mercury" asset_dir = "/home/user/.secrets/clusters/mercury"
# machines # machines
controller_names = ["node1"] controllers = [{
controller_macs = ["52:54:00:a1:9c:ae"] name = "node1"
controller_domains = ["node1.example.com"] mac = "52:54:00:a1:9c:ae"
worker_names = [ domain = "node1.example.com"
"node2", }]
"node3", workers = [
] {
worker_macs = [ name = "node2",
"52:54:00:b2:2f:86", mac = "52:54:00:b2:2f:86"
"52:54:00:c3:61:77", domain = "node2.example.com"
] },
worker_domains = [ {
"node2.example.com", name = "node3",
"node3.example.com", mac = "52:54:00:c3:61:77"
domain = "node3.example.com"
}
] ]
} }
``` ```
@ -283,9 +285,9 @@ systemd[1]: Started Kubernetes control plane.
$ export KUBECONFIG=/home/user/.secrets/clusters/mercury/auth/kubeconfig $ export KUBECONFIG=/home/user/.secrets/clusters/mercury/auth/kubeconfig
$ kubectl get nodes $ kubectl get nodes
NAME STATUS ROLES AGE VERSION NAME STATUS ROLES AGE VERSION
node1.example.com Ready <none> 10m v1.16.0 node1.example.com Ready <none> 10m v1.16.3
node2.example.com Ready <none> 10m v1.16.0 node2.example.com Ready <none> 10m v1.16.3
node3.example.com Ready <none> 10m v1.16.0 node3.example.com Ready <none> 10m v1.16.3
``` ```
List the pods. List the pods.
@ -318,19 +320,15 @@ Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/bare-me
| Name | Description | Example | | Name | Description | Example |
|:-----|:------------|:--------| |:-----|:------------|:--------|
| cluster_name | Unique cluster name | mercury | | cluster_name | Unique cluster name | "mercury" |
| matchbox_http_endpoint | Matchbox HTTP read-only endpoint | http://matchbox.example.com:port | | matchbox_http_endpoint | Matchbox HTTP read-only endpoint | "http://matchbox.example.com:port" |
| os_stream | Fedora CoreOS release stream | testing | | os_stream | Fedora CoreOS release stream | "testing" |
| os_version | Fedora CoreOS version to PXE and install | 30.20190716.1 | | os_version | Fedora CoreOS version to PXE and install | "30.20190716.1" |
| k8s_domain_name | FQDN resolving to the controller(s) nodes. Workers and kubectl will communicate with this endpoint | "myk8s.example.com" | | k8s_domain_name | FQDN resolving to the controller(s) nodes. Workers and kubectl will communicate with this endpoint | "myk8s.example.com" |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3Nz..." | | ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3Nz..." |
| asset_dir | Absolute path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/mercury" | | asset_dir | Absolute path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/mercury" |
| controller_names | Ordered list of controller short names | ["node1"] | | controllers | List of controller machine detail objects (unique name, identifying MAC address, FQDN) | `[{name="node1", mac="52:54:00:a1:9c:ae", domain="node1.example.com"}]` |
| controller_macs | Ordered list of controller identifying MAC addresses | ["52:54:00:a1:9c:ae"] | | workers | List of worker machine detail objects (unique name, identifying MAC address, FQDN) | `[{name="node2", mac="52:54:00:b2:2f:86", domain="node2.example.com"}, {name="node3", mac="52:54:00:c3:61:77", domain="node3.example.com"}]` |
| controller_domains | Ordered list of controller FQDNs | ["node1.example.com"] |
| worker_names | Ordered list of worker short names | ["node2", "node3"] |
| worker_macs | Ordered list of worker identifying MAC addresses | ["52:54:00:b2:2f:86", "52:54:00:c3:61:77"] |
| worker_domains | Ordered list of worker FQDNs | ["node2.example.com", "node3.example.com"] |
### Optional ### Optional
@ -341,9 +339,8 @@ Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/bare-me
| networking | Choice of networking provider | "calico" | "calico" or "flannel" | | networking | Choice of networking provider | "calico" | "calico" or "flannel" |
| network_mtu | CNI interface MTU (calico-only) | 1480 | - | | network_mtu | CNI interface MTU (calico-only) | 1480 | - |
| snippets | Map from machine names to lists of Fedora CoreOS Config snippets | {} | UNSUPPORTED | | snippets | Map from machine names to lists of Fedora CoreOS Config snippets | {} | UNSUPPORTED |
| network_ip_autodetection_method | Method to detect host IPv4 address (calico-only) | first-found | can-reach=10.0.0.1 | | network_ip_autodetection_method | Method to detect host IPv4 address (calico-only) | "first-found" | "can-reach=10.0.0.1" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" | | pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" | | service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by coredns. | "cluster.local" | "k8s.example.com" | | kernel_args | Additional kernel args to provide at PXE boot | [] | ["kvm-intel.nested=1"] |
| kernel_args | Additional kernel args to provide at PXE boot | [] | "kvm-intel.nested=1" |

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a> ## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.16.0 (upstream) * Kubernetes v1.16.3 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking * Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) * On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](advanced/worker-pools/), [preemptible](cl/google-cloud/#preemption) workers, and [snippets](advanced/customization/#container-linux) customization * Advanced features like [worker pools](advanced/worker-pools/), [preemptible](cl/google-cloud/#preemption) workers, and [snippets](advanced/customization/#container-linux) customization
@ -47,7 +47,7 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
```tf ```tf
module "google-cloud-yavin" { module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.16.0" source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.16.3"
# Google Cloud # Google Cloud
cluster_name = "yavin" cluster_name = "yavin"
@ -80,9 +80,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig $ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes $ kubectl get nodes
NAME ROLES STATUS AGE VERSION NAME ROLES STATUS AGE VERSION
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.16.0 yavin-controller-0.c.example-com.internal <none> Ready 6m v1.16.3
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.16.0 yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.16.3
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.16.0 yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.16.3
``` ```
List the pods. List the pods.

View File

@ -6,6 +6,39 @@ Typhoon ensures certain networking hardware integrates well with bare-metal Kube
Ubiquiti EdgeRouters and EdgeOS work well with bare-metal Kubernetes clusters. Familiarity with EdgeRouter setup and CLI usage is required. Ubiquiti EdgeRouters and EdgeOS work well with bare-metal Kubernetes clusters. Familiarity with EdgeRouter setup and CLI usage is required.
### DHCP
Assign static IPs to clients with known MAC addresses. This is called a static mapping by EdgeOS. Configure the router with the commands based on region inventory.
```
configure
show service dhcp-server shared-network
set service dhcp-server shared-network-name LAN subnet SUBNET static-mapping NAME mac-address MACADDR
set service dhcp-server shared-network-name LAN subnet SUBNET static-mapping NAME ip-address 10.0.0.20
```
### DNS
Add DNS A records to static IPs as `dnsmasq` host-records.
```
configure
set service dns forwarding options host-record=node.example.com,10.0.0.20
```
Forward `*.svc.cluster.local` queries to the CoreDNS Kubernetes service IP to allow clients to resolve Kubernetes services.
```
set service dns forwarding options server=/svc.cluster.local/10.3.0.10
commit-confirm
```
Restart `dnsmasq`.
```
sudo /etc/init.d/dnsmasq restart
```
### PXE ### PXE
Ubiquiti EdgeRouters can provide a PXE-enabled network boot environment for client machines. Ubiquiti EdgeRouters can provide a PXE-enabled network boot environment for client machines.
@ -65,55 +98,88 @@ set service dns forwarding options tftp-root=/config/tftpboot
commit-confirm commit-confirm
``` ```
### DHCP ### Routing
Assign static IPs to clients with known MAC addresses. This is called a static mapping by EdgeOS. Configure the router with the commands based on region inventory. #### Static Routes
``` Add static route(s) to Kubernetes node(s) that can route to Kubernetes service IPs (default: 10.3.0.0/16). Kubernetes service IPs will become routeable on the LAN.
configure
show service dhcp-server shared-network
set service dhcp-server shared-network-name LAN subnet SUBNET static-mapping NAME mac-address MACADDR
set service dhcp-server shared-network-name LAN subnet SUBNET static-mapping NAME ip-address 10.0.0.20
```
### DNS
Assign DNS A records to nodes as options to `dnsmasq`.
```
configure
set service dns forwarding options host-record=node.example.com,10.0.0.20
```
Restart `dnsmasq`.
```
sudo /etc/init.d/dnsmasq restart
```
Configure queries for `*.svc.cluster.local` to be forwarded to the Kubernetes `coredns` service IP to allow hosts to resolve cluster-local Kubernetes names.
```
configure
show service dns forwarding
set service dns forwarding options server=/svc.cluster.local/10.3.0.10
commit-confirm
```
### Kubernetes Services
Add static routes for the Kubernetes IPv4 service range to Kubernetes node(s) so hosts can route to Kubernetes services (default: 10.3.0.0/16).
``` ```
configure configure
show protocols static route show protocols static route
set protocols static route 10.3.0.0/16 next-hop NODE_IP set protocols static route 10.3.0.0/16 next-hop NODE_IP
...
commit-confirm commit-confirm
``` ```
!!! note !!! note
Adding multiple next-hop nodes provides equal-cost multi-path (ECMP) routing. EdgeOS v2.0+ is required. The kernel in prior versions used flow-hash to balanced packets, whereas with v2.0, round-robin sessions are used. Adding multiple next-hop nodes provides equal-cost multi-path (ECMP) routing. EdgeOS v2.0+ is required. The kernel in prior versions used flow-hash to balanced packets, whereas with v2.0, round-robin sessions are used.
#### BGP
EdgeRouter can exchange routes with other autonomous systems, including a cluster's Calico AS. Peers will exchange `podCIDR` routes to make individual pods routeable on the LAN.
Define the EdgeRouter AS (if undefined).
```
configure
show protocols bgp 1
set protocols bgp 1 parameters router-id ROUTER_IP
```
Peer with node(s) in another AS (eg. Calico default 64512)
```
set protocols bgp 1 neighbor NODE1_IP remote-as 64512
set protocols bgp 1 neighbor NODE2_IP remote-as 64512
set protocols bgp 1 neighbor NODE3_IP remote-as 64512
commit-confirm
```
Configure Calico node(s) as to peer with the EdgeRouter.
```
apiVersion: crd.projectcalico.org/v1
kind: BGPPeer
metadata:
name: NODE_NAME-to-edgerouter
spec:
peerIP: ROUTER_IP
asNumber: 1
node: NODE_NAME
```
Or, if every node is to be peered (i.e. full mesh), define a global BGPPeer.
```
apiVersion: crd.projectcalico.org/v1
kind: BGPPeer
metadata:
name: global
spec:
peerIP: ROUTER_IP
asNumber: 1
```
If Calico nodes should advertise Kubernetes Service IPs (i.e. ClusterIPs) as well, add a `BGPConfiguration`.
```
apiVersion: crd.projectcalico.org/v1
kind: BGPConfiguration
metadata:
name: default
spec:
logSeverityScreen: Info
nodeToNodeMeshEnabled: true
serviceClusterIPs:
- cidr: 10.3.0.0/16
```
Show a summary of peers and exchanged routes.
```
show ip bgp summary
show ip route bgp
```
### Port Forwarding ### Port Forwarding
@ -150,35 +216,3 @@ set service gui https-port 4443
commit-confirm commit-confirm
``` ```
### BGP
Add the EdgeRouter as a global BGP peer for nodes in a Kubernetes cluster (requires Calico). Neighbors will exchange `podCIDR` routes and individual pods will become routable on the LAN.
Configure node(s) as BGP neighbors.
```
show protocols bgp 1
set protocols bgp 1 parameters router-id LAN_IP
set protocols bgp 1 neighbor NODE1_IP remote-as 64512
set protocols bgp 1 neighbor NODE2_IP remote-as 64512
set protocols bgp 1 neighbor NODE3_IP remote-as 64512
```
View the neighbors and exchanged routes.
```
show ip bgp neighbors
show ip route bgp
```
Be sure to register the peer by creating a Calico `BGPPeer` CRD with `kubectl apply`.
```
apiVersion: crd.projectcalico.org/v1
kind: BGPPeer
metadata:
name: NAME
spec:
peerIP: LAN_IP
asNumber: 64512
```

View File

@ -18,7 +18,7 @@ module "google-cloud-yavin" {
} }
module "bare-metal-mercury" { module "bare-metal-mercury" {
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.16.0" source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.16.3"
... ...
} }
``` ```
@ -279,15 +279,15 @@ Typhoon modules have been adapted for Terraform v0.12. Provider plugins requirem
| Typhoon Release | Terraform version | | Typhoon Release | Terraform version |
|-------------------|---------------------| |-------------------|---------------------|
| v1.16.0 - ? | v0.12.x | | v1.16.3 - ? | v0.12.x |
| v1.10.3 - v1.16.0 | v0.11.x | | v1.10.3 - v1.16.3 | v0.11.x |
| v1.9.2 - v1.10.2 | v0.10.4+ or v0.11.x | | v1.9.2 - v1.10.2 | v0.10.4+ or v0.11.x |
| v1.7.3 - v1.9.1 | v0.10.x | | v1.7.3 - v1.9.1 | v0.10.x |
| v1.6.4 - v1.7.2 | v0.9.x | | v1.6.4 - v1.7.2 | v0.9.x |
### New users ### New users
New users can start with Terraform v0.12.x and follow the docs for Typhoon v1.16.0+ without issue. New users can start with Terraform v0.12.x and follow the docs for Typhoon v1.16.3+ without issue.
### Existing users ### Existing users
@ -404,7 +404,7 @@ tree .
└── infraB <- new Terraform v0.12.x configs └── infraB <- new Terraform v0.12.x configs
``` ```
Define Typhoon clusters in the new config directory using Terraform v0.12 syntax. Follow the Typhoon v1.16.0+ docs (e.g. use `terraform12` in the `infraB` dir). See [AWS](/cl/aws), [Azure](/cl/azure), [Bare-Metal](/cl/bare-metal), [Digital Ocean](/cl/digital-ocean), or [Google-Cloud](/cl/google-cloud)) to create new clusters. Follow the usual [upgrade](/topics/maintenance/#upgrades) process to apply workloads and shift traffic. Later, switch back to the old config directory and deprovision clusters with Terraform v0.11. Define Typhoon clusters in the new config directory using Terraform v0.12 syntax. Follow the Typhoon v1.16.3+ docs (e.g. use `terraform12` in the `infraB` dir). See [AWS](/cl/aws), [Azure](/cl/azure), [Bare-Metal](/cl/bare-metal), [Digital Ocean](/cl/digital-ocean), or [Google-Cloud](/cl/google-cloud)) to create new clusters. Follow the usual [upgrade](/topics/maintenance/#upgrades) process to apply workloads and shift traffic. Later, switch back to the old config directory and deprovision clusters with Terraform v0.11.
```shell ```shell
terraform12 init terraform12 init

View File

@ -12,7 +12,7 @@ Typhoon aims to be minimal and secure. We're running it ourselves after all.
* Workloads run on worker nodes only, unless they tolerate the master taint * Workloads run on worker nodes only, unless they tolerate the master taint
* Kubernetes [Network Policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) and Calico [NetworkPolicy](https://docs.projectcalico.org/latest/reference/calicoctl/resources/networkpolicy) support [^1] * Kubernetes [Network Policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) and Calico [NetworkPolicy](https://docs.projectcalico.org/latest/reference/calicoctl/resources/networkpolicy) support [^1]
[^1]: Requires `networking = "calico"`. Calico is the default on AWS, bare-metal, and Google Cloud. Azure and Digital Ocean are limited to `networking = "flannel"`. [^1]: Requires `networking = "calico"`. Calico is the default on all platforms (AWS, Azure, bare-metal, DigitalOcean, and Google Cloud).
**Hosts** **Hosts**

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a> ## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.16.0 (upstream) * Kubernetes v1.16.3 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking * Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) * On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [preemptible](https://typhoon.psdn.io/cl/google-cloud/#preemption) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization * Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [preemptible](https://typhoon.psdn.io/cl/google-cloud/#preemption) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests) # Kubernetes assets (kubeconfig, manifests)
module "bootstrap" { module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=539b725093c8cd94ba46603adb25ac5280562ec8" source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=0daa1276c633fea28e41b2c2c18831e2584deb24"
cluster_name = var.cluster_name cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)] api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]

View File

@ -7,7 +7,7 @@ systemd:
- name: 40-etcd-cluster.conf - name: 40-etcd-cluster.conf
contents: | contents: |
[Service] [Service]
Environment="ETCD_IMAGE_TAG=v3.4.0" Environment="ETCD_IMAGE_TAG=v3.4.3"
Environment="ETCD_NAME=${etcd_name}" Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379" Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380" Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
@ -112,7 +112,7 @@ systemd:
--volume script,kind=host,source=/opt/bootstrap/apply \ --volume script,kind=host,source=/opt/bootstrap/apply \
--mount volume=script,target=/apply \ --mount volume=script,target=/apply \
--insecure-options=image \ --insecure-options=image \
docker://k8s.gcr.io/hyperkube:v1.16.0 \ docker://k8s.gcr.io/hyperkube:v1.16.3 \
--net=host \ --net=host \
--dns=host \ --dns=host \
--exec=/apply --exec=/apply
@ -133,7 +133,7 @@ storage:
contents: contents:
inline: | inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.16.0 KUBELET_IMAGE_TAG=v1.16.3
- path: /opt/bootstrap/apply - path: /opt/bootstrap/apply
filesystem: root filesystem: root
mode: 0544 mode: 0544

View File

@ -23,45 +23,45 @@ variable "dns_zone_name" {
# instances # instances
variable "controller_count" { variable "controller_count" {
type = string type = number
default = "1"
description = "Number of controllers (i.e. masters)" description = "Number of controllers (i.e. masters)"
default = 1
} }
variable "worker_count" { variable "worker_count" {
type = string type = number
default = "1"
description = "Number of workers" description = "Number of workers"
default = 1
} }
variable "controller_type" { variable "controller_type" {
type = string type = string
default = "n1-standard-1"
description = "Machine type for controllers (see `gcloud compute machine-types list`)" description = "Machine type for controllers (see `gcloud compute machine-types list`)"
default = "n1-standard-1"
} }
variable "worker_type" { variable "worker_type" {
type = string type = string
default = "n1-standard-1"
description = "Machine type for controllers (see `gcloud compute machine-types list`)" description = "Machine type for controllers (see `gcloud compute machine-types list`)"
default = "n1-standard-1"
} }
variable "os_image" { variable "os_image" {
type = string type = string
default = "coreos-stable"
description = "Container Linux image for compute instances (e.g. coreos-stable)" description = "Container Linux image for compute instances (e.g. coreos-stable)"
default = "coreos-stable"
} }
variable "disk_size" { variable "disk_size" {
type = string type = number
default = "40"
description = "Size of the disk in GB" description = "Size of the disk in GB"
default = 40
} }
variable "worker_preemptible" { variable "worker_preemptible" {
type = string type = bool
default = "false"
description = "If enabled, Compute Engine will terminate workers randomly within 24 hours" description = "If enabled, Compute Engine will terminate workers randomly within 24 hours"
default = false
} }
variable "controller_clc_snippets" { variable "controller_clc_snippets" {
@ -84,48 +84,55 @@ variable "ssh_authorized_key" {
} }
variable "asset_dir" { variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
type = string type = string
description = "Absolute path to a directory where generated assets should be placed (contains secrets)"
} }
variable "networking" { variable "networking" {
description = "Choice of networking provider (flannel or calico)"
type = string type = string
description = "Choice of networking provider (flannel or calico)"
default = "calico" default = "calico"
} }
variable "pod_cidr" { variable "pod_cidr" {
description = "CIDR IPv4 range to assign Kubernetes pods"
type = string type = string
description = "CIDR IPv4 range to assign Kubernetes pods"
default = "10.2.0.0/16" default = "10.2.0.0/16"
} }
variable "service_cidr" { variable "service_cidr" {
type = string
description = <<EOD description = <<EOD
CIDR IPv4 range to assign Kubernetes services. CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns. The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD EOD
type = string
default = "10.3.0.0/16" default = "10.3.0.0/16"
} }
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = string
default = "cluster.local"
}
variable "enable_reporting" { variable "enable_reporting" {
type = string type = bool
description = "Enable usage or analytics reporting to upstreams (Calico)" description = "Enable usage or analytics reporting to upstreams (Calico)"
default = "false" default = false
} }
variable "enable_aggregation" { variable "enable_aggregation" {
type = bool
description = "Enable the Kubernetes Aggregation Layer (defaults to false)" description = "Enable the Kubernetes Aggregation Layer (defaults to false)"
type = string default = false
default = "false" }
variable "worker_node_labels" {
type = list(string)
description = "List of initial worker node labels"
default = []
}
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" {
type = string
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
default = "cluster.local"
} }

View File

@ -18,5 +18,6 @@ module "workers" {
service_cidr = var.service_cidr service_cidr = var.service_cidr
cluster_domain_suffix = var.cluster_domain_suffix cluster_domain_suffix = var.cluster_domain_suffix
clc_snippets = var.worker_clc_snippets clc_snippets = var.worker_clc_snippets
node_labels = var.worker_node_labels
} }

View File

@ -60,6 +60,9 @@ systemd:
--lock-file=/var/run/lock/kubelet.lock \ --lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \ --network-plugin=cni \
--node-labels=node.kubernetes.io/node \ --node-labels=node.kubernetes.io/node \
%{ for label in split(",", node_labels) }
--node-labels=${label} \
%{ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \ --pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \ --read-only-port=0 \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins --volume-plugin-dir=/var/lib/kubelet/volumeplugins
@ -94,7 +97,7 @@ storage:
contents: contents:
inline: | inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.16.0 KUBELET_IMAGE_TAG=v1.16.3
- path: /etc/sysctl.d/max-user-watches.conf - path: /etc/sysctl.d/max-user-watches.conf
filesystem: root filesystem: root
contents: contents:
@ -112,7 +115,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \ --volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \ --mount volume=config,target=/etc/kubernetes \
--insecure-options=image \ --insecure-options=image \
docker://k8s.gcr.io/hyperkube:v1.16.0 \ docker://k8s.gcr.io/hyperkube:v1.16.3 \
--net=host \ --net=host \
--dns=host \ --dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname) --exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)

View File

@ -23,33 +23,39 @@ variable "network" {
# instances # instances
variable "worker_count" { variable "worker_count" {
type = string type = number
default = "1"
description = "Number of worker compute instances the instance group should manage" description = "Number of worker compute instances the instance group should manage"
default = 1
} }
variable "machine_type" { variable "machine_type" {
type = string type = string
default = "n1-standard-1"
description = "Machine type for compute instances (e.g. gcloud compute machine-types list)" description = "Machine type for compute instances (e.g. gcloud compute machine-types list)"
default = "n1-standard-1"
} }
variable "os_image" { variable "os_image" {
type = string type = string
default = "coreos-stable"
description = "Container Linux image for compute instanges (e.g. gcloud compute images list)" description = "Container Linux image for compute instanges (e.g. gcloud compute images list)"
default = "coreos-stable"
} }
variable "disk_size" { variable "disk_size" {
type = string type = number
default = "40"
description = "Size of the disk in GB" description = "Size of the disk in GB"
default = 40
} }
variable "preemptible" { variable "preemptible" {
type = string type = bool
default = "false"
description = "If enabled, Compute Engine will terminate instances randomly within 24 hours" description = "If enabled, Compute Engine will terminate instances randomly within 24 hours"
default = false
}
variable "clc_snippets" {
type = list(string)
description = "Container Linux Config snippets"
default = []
} }
# configuration # configuration
@ -65,30 +71,28 @@ variable "ssh_authorized_key" {
} }
variable "service_cidr" { variable "service_cidr" {
type = string
description = <<EOD description = <<EOD
CIDR IPv4 range to assign Kubernetes services. CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns. The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD EOD
type = string
default = "10.3.0.0/16" default = "10.3.0.0/16"
} }
variable "cluster_domain_suffix" { variable "node_labels" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = string
default = "cluster.local"
}
variable "clc_snippets" {
type = list(string) type = list(string)
description = "Container Linux Config snippets" description = "List of initial node labels"
default = [] default = []
} }
# unofficial, undocumented, unsupported, temporary # unofficial, undocumented, unsupported, temporary
variable "cluster_domain_suffix" {
type = string
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
default = "cluster.local"
}
variable "accelerator_type" { variable "accelerator_type" {
type = string type = string
default = "" default = ""

View File

@ -82,6 +82,7 @@ data "template_file" "worker-config" {
ssh_authorized_key = var.ssh_authorized_key ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10) cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix cluster_domain_suffix = var.cluster_domain_suffix
node_labels = join(",", var.node_labels)
} }
} }

View File

@ -1,4 +1,4 @@
mkdocs==1.0.4 mkdocs==1.0.4
mkdocs-material==4.4.2 mkdocs-material==4.4.3
pygments==2.2.0 pygments==2.2.0
pymdown-extensions==5.0.0 pymdown-extensions==5.0.0