Mirror of https://github.com/puppetmaster/typhoon.git (synced 2025-08-04 03:41:34 +02:00)

Compare commits (122 commits)
Commits (SHA1): eb29fb639b, fcbdb50d93, efac611e9c, 87ff431b80, 0d8ceae1d9, c5cf803634, 61ee01f462, cbef202eec, 0c99b909a9, 739db3b35f,
c68b035a63, 1a5949824c, 9bac641511, 37ff3c28eb, f03045f0dc, b603bbde3d, 810236f6df, 3c3d3a2473, 1af9fd8094, c734fa7b84,
fdade5b40c, 171fd2c998, 545bd79624, 12b825c78f, 66e7354c8a, 3a71b2ccb1, c7e327417b, e313e733ab, d0e73b8174, 65ddd2419c,
b0e9b1fa60, 485feb82c4, 0b276b6b7e, e8513e58bb, d77343be3a, f2b01e1d75, 60c2107d7f, 30cfeec6c1, ba8774ee0d, 24e63bd134,
996bdd9112, a34d78f55d, 04b2e149ba, 9f0126a410, a1bab9c96e, 966fd280b0, e4e074c894, d51da49925, 2076a779a3, 048094b256,
75b063c586, 1620d1e456, 939bffbf98, bc96443710, 82a7422b3d, 132ab395a5, 5f87eb3ec9, b152b9f973, 9c842395a8, 6cb9c0341b,
d4fd6d4adb, 3664dfafc2, e535ddd15a, 5752a8f041, 68abbf7b0d, 67047ead08, c11e23fc50, b647ad8806, 2eb1ac1b4d, cb2721ef7d,
fc06d28e13, a9078cb52b, ebd9570ede, 34e8db7aae, 084e8bea49, d73621c838, 1a6481df04, 798ec9a92f, 96aed4c3c3, 7372d33af8,
451ec771a8, 4d9846b83e, 597ca4acce, 507c646e8b, d8f7da6873, 048f1f514e, b825cd9afe, 796149d122, a66bccd590, 30b1edfcc6,
a4afe06b64, 4d58be0816, 170b768ad8, 5bc1cd28c3, 13fbac6c79, a8fa4a9a06, a5c1a96df1, 6a091e245e, 590796ee62, ec389295fe,
3c807f3478, e76fe80b45, 32853aaa7b, c32a54db40, 9671b1c734, 3b933e1ab3, 58d8f6f505, 56853fe222, 18165d8076, 50acf28ce5,
ab793eb842, b74c958524, 2024d3c32e, 11c434915f, 05f7df9e80, 4220b9ce18, 6a6af4aa16, 3dcd10f3b8, 22503993b9, cf3aa8885b,
ba61a137db, 646bdd78e4

.github/FUNDING.yml (vendored, new file, +1)

@@ -0,0 +1 @@
github: [poseidon]

.github/dependabot.yaml (vendored, new file, +9)

@@ -0,0 +1,9 @@
version: 2
updates:
  - package-ecosystem: pip
    directory: "/"
    schedule:
      interval: weekly
    pull-request-branch-name:
      separator: "-"
    open-pull-requests-limit: 3

CHANGES.md (232 lines changed)

@@ -2,6 +2,238 @@

Notable changes between versions.

## Latest

## v1.22.1

* Kubernetes [v1.22.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1221)
* Update Calico from v3.19.1 to [v3.20.0](https://github.com/projectcalico/calico/releases/tag/v3.20.0)

### Addons

* Update nginx-ingress from v1.0.0-beta.1 to [v1.0.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.0.0)
* Update Prometheus from v2.28.1 to [v2.29.1](https://github.com/prometheus/prometheus/releases/tag/v2.29.1)
* Update Grafana from v8.1.1 to [v8.1.2](https://github.com/grafana/grafana/releases/tag/v8.1.2)
## v1.22.0

* Kubernetes [v1.22.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md#v1220)
* Update etcd from v3.4.16 to [v3.5.0](https://github.com/etcd-io/etcd/releases/tag/v3.5.0)
* Switch `kube-controller-manager` and `kube-scheduler` to use secure port only
* Update Prometheus config to discover endpoints and use a bearer token to scrape

### Fedora CoreOS

* Add Cilium cgroups v2 support on Fedora CoreOS
* Update Butane Config version from v1.2.0 to v1.4.0
* Rename Fedora CoreOS Config to Butane Config
* Require any [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customizations to update to v1.4.0 (see the sketch below)
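Snippets are Butane Configs merged into the generated machine configs, so a snippet written against the older Fedora CoreOS Config v1.2.0 schema needs its header bumped. A minimal sketch, assuming the module's `worker_snippets` list variable and an illustrative file-drop snippet (neither is quoted from this diff):

```tf
module "yavin" {
  source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.22.0"

  # ...required cluster arguments omitted for brevity...

  # hypothetical snippet, updated to declare the Butane v1.4.0 schema
  worker_snippets = [
    <<-EOT
    variant: fcos
    version: 1.4.0
    storage:
      files:
        - path: /etc/motd.d/50-hello.conf
          mode: 0644
          contents:
            inline: |
              Managed by a Typhoon snippet
    EOT
  ]
}
```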
### Addons

* Update nginx-ingress from v0.47.0 to [v1.0.0-beta.1](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.0.0-beta.1)
* Update node-exporter from v1.2.0 to [v1.2.2](https://github.com/prometheus/node_exporter/releases/tag/v1.2.2)
* Update kube-state-metrics from v2.1.0 to [v2.1.1](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.1.1)
* Update Grafana from v8.0.6 to [v8.1.1](https://github.com/grafana/grafana/releases/tag/v8.1.1)
## v1.21.3

* Kubernetes [v1.21.3](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#v1213)
* Update Cilium from v1.10.1 to [v1.10.3](https://github.com/cilium/cilium/releases/tag/v1.10.3)
* Require [poseidon/ct](https://github.com/poseidon/terraform-provider-ct) Terraform provider v0.9+ ([notes](https://typhoon.psdn.io/topics/maintenance/#upgrade-terraform-provider-ct)), as sketched below
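A minimal sketch of satisfying that requirement in the Terraform working directory that manages the cluster; the constraint shown is illustrative, only the v0.9+ floor comes from this entry:

```tf
# providers.tf (sketch): require the poseidon/ct provider at v0.9 or newer
terraform {
  required_version = ">= 0.13.0"

  required_providers {
    ct = {
      source  = "poseidon/ct"
      version = ">= 0.9.0"
    }
  }
}
```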
### AWS

* Change default disk type from `gp2` to `gp3` ([#1012](https://github.com/poseidon/typhoon/pull/1012))
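Clusters that want to keep the old EBS volume type can pin it explicitly. A sketch, assuming the AWS module continues to expose a `disk_type` variable (the variable name is an assumption here; only the gp2/gp3 defaults come from the entry above):

```tf
module "tempest" {
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.21.3"

  # ...required cluster arguments omitted for brevity...

  # keep the previous default; omit to accept the new gp3 default
  disk_type = "gp2"
}
```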
### Addons

* Update Prometheus from v2.28.0 to [v2.28.1](https://github.com/prometheus/prometheus/releases/tag/v2.28.1)
* Update node-exporter from v1.1.2 to [v1.2.0](https://github.com/prometheus/node_exporter/releases/tag/v1.2.0)
* Update Grafana from v8.0.3 to [v8.0.6](https://github.com/grafana/grafana/releases/tag/v8.0.6)

### Known Issues

* Cilium with recent Fedora CoreOS will have networking issues ([fedora-coreos#881](https://github.com/coreos/fedora-coreos-tracker/issues/881)) (fixed in v1.21.4)
## v1.21.2

* Kubernetes [v1.21.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#v1212)
* Add Terraform v1.0.x support ([#974](https://github.com/poseidon/typhoon/pull/974))
  * Continue to support Terraform v0.13.x, v0.14.4+, and v0.15.x
* Update CoreDNS from v1.8.0 to v1.8.4 ([#1006](https://github.com/poseidon/typhoon/pull/1006))
* Update Cilium from v1.9.6 to [v1.10.1](https://github.com/cilium/cilium/releases/tag/v1.10.1)
* Update Calico from v3.19.0 to [v3.19.1](https://github.com/projectcalico/calico/releases/tag/v3.19.1)

### Addons

* Update kube-state-metrics from v2.0.0 to [v2.1.0](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.1.0)
* Update Prometheus from v2.27.0 to [v2.28.0](https://github.com/prometheus/prometheus/releases/tag/v2.28.0)
* Update Grafana from v7.5.6 to [v8.0.3](https://github.com/grafana/grafana/releases/tag/v8.0.3)
* Update nginx-ingress from v0.46.0 to [v0.47.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.47.0)

### Fedora CoreOS

#### AWS

* Extend experimental Fedora CoreOS arm64 support with Cilium
  * CNI provider may now be `flannel` or `cilium` (new), as sketched below
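A sketch of opting into the experimental arm64 plus Cilium combination on AWS; the `arch` variable name and value are assumptions for illustration, while the `flannel`/`cilium` choices for `networking` come from the entry above:

```tf
module "tempest-arm64" {
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.21.2"

  # ...required cluster arguments omitted for brevity...

  arch       = "arm64"  # assumed variable for the experimental arm64 support
  networking = "cilium" # may now be "flannel" or "cilium"
}
```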
#### Bare-Metal

* Workaround systemd path unit issue [fedora-coreos-tracker/#861](https://github.com/coreos/fedora-coreos-tracker/issues/861)

#### DigitalOcean

* Workaround systemd path unit issue [fedora-coreos-tracker/#861](https://github.com/coreos/fedora-coreos-tracker/issues/861)

### Known Issues

* Cilium with recent Fedora CoreOS will have networking issues ([fedora-coreos#881](https://github.com/coreos/fedora-coreos-tracker/issues/881)) (fixed in v1.21.4)
## v1.21.1

* Kubernetes [v1.21.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#v1211)
* Add Terraform v0.15.x support ([#974](https://github.com/poseidon/typhoon/pull/974))
  * Continue to support Terraform v0.13.x and v0.14.4+
* Update etcd from v3.4.15 to [v3.4.16](https://github.com/etcd-io/etcd/releases/tag/v3.4.16)
* Update Cilium from v1.9.5 to [v1.9.6](https://github.com/cilium/cilium/releases/tag/v1.9.6)
* Update Calico from v3.18.1 to [v3.19.0](https://github.com/projectcalico/calico/releases/tag/v3.19.0)

### AWS

* Reduce the default `disk_size` from 40GB to 30GB ([#983](https://github.com/poseidon/typhoon/pull/983))

### Azure

* Reduce the default `disk_size` from 40GB to 30GB ([#983](https://github.com/poseidon/typhoon/pull/983))

### Google Cloud

* Reduce the default `disk_size` from 40GB to 30GB ([#983](https://github.com/poseidon/typhoon/pull/983))

### Fedora CoreOS

* Update Kubelet mounts for cgroups v2 ([#978](https://github.com/poseidon/typhoon/pull/978))

### Addons

* Update kube-state-metrics from v2.0.0-rc.1 to [v2.0.0](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0)
* Update Prometheus from v2.25.2 to [v2.27.0](https://github.com/prometheus/prometheus/releases/tag/v2.27.0)
* Update Grafana from v7.5.3 to [v7.5.6](https://github.com/grafana/grafana/releases/tag/v7.5.6)
* Update nginx-ingress from v0.45.0 to [v0.46.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.46.0)
## v1.21.0

* Kubernetes [v1.21.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#v1210)
* Enable `tokencleaner` controller ([#969](https://github.com/poseidon/typhoon/pull/969))
* Enable `kube-scheduler` and `kube-controller-manager` separate authn/z kubeconfig
* Change CNI config location from /etc/kubernetes/cni/net.d to /etc/cni/net.d ([#965](https://github.com/poseidon/typhoon/pull/965))
* Change `kube-controller-manager` to mount `/var/lib/kubelet/volumeplugins` directly
* Remove unused `cloud-provider` flags
* Update Fedora CoreOS Config version from v1.1.0 to v1.2.0 ([#970](https://github.com/poseidon/typhoon/pull/970))
* Require [poseidon/ct](https://github.com/poseidon/terraform-provider-ct) Terraform provider v0.8+ ([notes](https://typhoon.psdn.io/topics/maintenance/#upgrade-terraform-provider-ct))
* Require any [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customizations to update to v1.2.0

### AWS

* Allow setting custom initial node taints on worker pools ([#968](https://github.com/poseidon/typhoon/pull/968)), as sketched below
  * Add `node_taints` variable to internal `workers` pool module to set initial node taints
  * Add `daemonset_tolerations` so `kube-system` DaemonSets can tolerate custom taints
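A sketch of how these two variables are meant to work together: the cluster is told which taint keys its `kube-system` DaemonSets should tolerate, and a dedicated worker pool registers its nodes with a matching taint. The taint/toleration string formats and the `workers` module path are illustrative assumptions, not quoted from the Typhoon docs:

```tf
module "tempest" {
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.21.0"

  # ...required cluster arguments omitted for brevity...

  # let kube-system DaemonSets (CNI, kube-proxy, etc.) tolerate the "role" taint key
  daemonset_tolerations = ["role"]
}

module "tempest-gpu-pool" {
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes/workers?ref=v1.21.0"

  # ...required worker pool arguments omitted for brevity...

  # initial taint applied when these workers register with the cluster
  node_taints = ["role=gpu:NoSchedule"]
}
```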
### Azure

* Allow setting custom initial node taints on worker pools ([#968](https://github.com/poseidon/typhoon/pull/968))
  * Add `node_taints` variable to internal `workers` pool module to set initial node taints
  * Add `daemonset_tolerations` so `kube-system` DaemonSets can tolerate custom taints
* Remove deprecated `azurerm_lb_backend_address_pool` field `resource_group_name` ([#972](https://github.com/poseidon/typhoon/pull/972))

### Google Cloud

* Allow setting custom initial node taints on worker pools ([#968](https://github.com/poseidon/typhoon/pull/968))
  * Add `node_taints` variable to internal `workers` pool module to set initial node taints
  * Add `daemonset_tolerations` so `kube-system` DaemonSets can tolerate custom taints

### Addons

* Update nginx-ingress from v0.44.0 to [v0.45.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.45.0)
* Update kube-state-metrics from v2.0.0-rc.0 to [v2.0.0-rc.1](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-rc.1)
* Update Grafana from v7.4.5 to [v7.5.3](https://github.com/grafana/grafana/releases/tag/v7.5.3)
## v1.20.5

* Kubernetes [v1.20.5](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1205)
* Update etcd from v3.4.14 to [v3.4.15](https://github.com/etcd-io/etcd/releases/tag/v3.4.15)
* Update Cilium from v1.9.4 to [v1.9.5](https://github.com/cilium/cilium/releases/tag/v1.9.5)
* Update Calico from v3.17.3 to [v3.18.1](https://github.com/projectcalico/calico/releases/tag/v3.18.1)
* Update CoreDNS from v1.7.0 to [v1.8.0](https://coredns.io/2020/10/22/coredns-1.8.0-release/)
* Mark bootstrap token as sensitive in Terraform plans ([#949](https://github.com/poseidon/typhoon/pull/949))

### Fedora CoreOS

* Set Kubelet `provider-id` ([#951](https://github.com/poseidon/typhoon/pull/951))

### Flatcar Linux

#### AWS

* Set Kubelet `provider-id` ([#951](https://github.com/poseidon/typhoon/pull/951))
* Remove `os_image` option `flatcar-edge` ([#943](https://github.com/poseidon/typhoon/pull/943))

#### Azure

* Remove `os_image` option `flatcar-edge` ([#943](https://github.com/poseidon/typhoon/pull/943))

#### Bare-Metal

* Remove `os_channel` option `flatcar-edge` ([#943](https://github.com/poseidon/typhoon/pull/943))

### Addons

* Update Prometheus from v2.25.0 to [v2.25.2](https://github.com/prometheus/prometheus/releases/tag/v2.25.2)
* Update kube-state-metrics from v2.0.0-alpha.3 to [v2.0.0-rc.0](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-rc.0)
  * Switch image from `quay.io` to `k8s.gcr.io` ([#946](https://github.com/poseidon/typhoon/pull/946))
* Update node-exporter from v1.1.1 to [v1.1.2](https://github.com/prometheus/node_exporter/releases/tag/v1.1.2)
* Update Grafana from v7.4.2 to [v7.4.5](https://github.com/grafana/grafana/releases/tag/v7.4.5)
## v1.20.4

* Kubernetes [v1.20.4](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1204)
* Update Cilium from v1.9.1 to [v1.9.4](https://github.com/cilium/cilium/releases/tag/v1.9.4)
* Update Calico from v3.17.1 to [v3.17.3](https://github.com/projectcalico/calico/releases/tag/v3.17.3)
* Update flannel-cni from v0.4.1 to [v0.4.2](https://github.com/poseidon/flannel-cni/releases/tag/v0.4.2)

### Addons

* Update nginx-ingress from v0.43.0 to [v0.44.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.44.0)
* Update Prometheus from v2.24.0 to [v2.25.0](https://github.com/prometheus/prometheus/releases/tag/v2.25.0)
* Update node-exporter from v1.0.1 to [v1.1.1](https://github.com/prometheus/node_exporter/releases/tag/v1.1.1)
* Update Grafana from v7.3.7 to [v7.4.2](https://github.com/grafana/grafana/releases/tag/v7.4.2)
## v1.20.2

* Kubernetes [v1.20.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1202)
* Support Terraform v0.13.x and v0.14.4+ ([#924](https://github.com/poseidon/typhoon/pull/923))

### Addons

* Update nginx-ingress from v0.41.2 to [v0.43.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.43.0)
* Update Prometheus from v2.23.0 to [v2.24.0](https://github.com/prometheus/prometheus/releases/tag/v2.24.0)
* Update Grafana from v7.3.6 to [v7.3.7](https://github.com/grafana/grafana/releases/tag/v7.3.7)
## v1.20.1

* Kubernetes [v1.20.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1201)

### Fedora CoreOS

* Fedora CoreOS 33 has stronger crypto defaults ([**notice**](https://docs.fedoraproject.org/en-US/fedora-coreos/faq/#_why_does_ssh_stop_working_after_upgrading_to_fedora_33), [#915](https://github.com/poseidon/typhoon/issues/915))
  * Use a non-RSA SSH key or add the workaround provided in upstream [Fedora docs](https://docs.fedoraproject.org/en-US/fedora-coreos/faq/#_why_does_ssh_stop_working_after_upgrading_to_fedora_33) as a [snippet](https://typhoon.psdn.io/advanced/customization/#fedora-coreos) (**action required**)

### Addons

* Update Grafana from v7.3.5 to [v7.3.6](https://github.com/grafana/grafana/releases/tag/v7.3.6)
## v1.20.0

* Kubernetes [v1.20.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1200)

README.md (31 lines changed)

@@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster

 ## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>

-* Kubernetes v1.20.0 (upstream)
+* Kubernetes v1.22.1 (upstream)
 * Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
 * On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
 * Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [preemptible](https://typhoon.psdn.io/flatcar-linux/google-cloud/#preemption) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

@@ -31,6 +31,10 @@ Typhoon is available for [Fedora CoreOS](https://getfedora.org/coreos/).

 | DigitalOcean | Fedora CoreOS | [digital-ocean/fedora-coreos/kubernetes](digital-ocean/fedora-coreos/kubernetes) | beta |
 | Google Cloud | Fedora CoreOS | [google-cloud/fedora-coreos/kubernetes](google-cloud/fedora-coreos/kubernetes) | stable |
+
+| Platform | Operating System | Terraform Module | Status |
+|---------------|------------------|------------------|--------|
+| AWS | Fedora CoreOS (ARM64) | [aws/fedora-coreos/kubernetes](aws/fedora-coreos/kubernetes) | alpha |

 Typhoon is available for [Flatcar Linux](https://www.flatcar-linux.org/releases/).

 | Platform | Operating System | Terraform Module | Status |

@@ -54,7 +58,7 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo

 ```tf
 module "yavin" {
-  source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.20.0"
+  source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.22.1"

   # Google Cloud
   cluster_name = "yavin"

@@ -63,7 +67,7 @@ module "yavin" {

   dns_zone_name = "example-zone"

   # configuration
-  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
+  ssh_authorized_key = "ssh-ed25519 AAAAB3Nz..."

   # optional
   worker_count = 2

@@ -93,9 +97,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou

 $ export KUBECONFIG=/home/user/.kube/configs/yavin-config
 $ kubectl get nodes
 NAME ROLES STATUS AGE VERSION
-yavin-controller-0.c.example-com.internal <none> Ready 6m v1.20.0
-yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.20.0
-yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.20.0
+yavin-controller-0.c.example-com.internal <none> Ready 6m v1.22.1
+yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.22.1
+yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.22.1
 ```

 List the pods.

@@ -126,7 +130,7 @@ Typhoon is strict about minimalism, maturity, and scope. These are not in scope:

 ## Help

-Ask questions on the IRC #typhoon channel on [freenode.net](http://freenode.net/).
+Schedule a meeting via [Github Sponsors](https://github.com/sponsors/poseidon?frequency=one-time) to discuss your use case.

 ## Motivation

@@ -136,12 +140,17 @@ Typhoon addresses real world needs, which you may share. It is honest about limi

 ## Social Contract

-Typhoon is not a product, trial, or free-tier. It is not run by a company, does not offer support or services, and does not accept or make any money. It is not associated with any operating system or platform vendor.
+Typhoon is not a product, trial, or free-tier. Typhoon does not offer support, services, or charge money. And Typhoon is independent of operating system or platform vendors.

 Typhoon clusters will contain only [free](https://www.debian.org/intro/free) components. Cluster components will not collect data on users without their permission.

-## Donations
+## Sponsors

-Typhoon does not accept money donations. Instead, we encourage you to donate to one of [these organizations](https://github.com/poseidon/typhoon/wiki/Donations) to show your appreciation.
+Poseidon's Github [Sponsors](https://github.com/sponsors/poseidon) support the infrastructure and operational costs of providing Typhoon.

-* [DigitalOcean](https://www.digitalocean.com/) kindly provides credits to support Typhoon test clusters.
+<a href="https://www.digitalocean.com/">
+  <img src="https://opensource.nyc3.cdn.digitaloceanspaces.com/attribution/assets/SVG/DO_Logo_horizontal_blue.svg" width="201px">
+</a>
+<br>
+
+If you'd like your company here, please contact dghubble at psdn.io.

@@ -37,6 +37,7 @@ data:
 "dashes": false,
 "datasource": "$datasource",
 "fill": 1,
+"fillGradient": 0,
 "gridPos": {

 },
@@ -129,6 +130,7 @@ data:
 "dashes": false,
 "datasource": "$datasource",
 "fill": 1,
+"fillGradient": 0,
 "gridPos": {

 },
@@ -221,6 +223,7 @@ data:
 "dashes": false,
 "datasource": "$datasource",
 "fill": 1,
+"fillGradient": 0,
 "gridPos": {

 },
@@ -326,6 +329,7 @@ data:
 "dashes": false,
 "datasource": "$datasource",
 "fill": 1,
+"fillGradient": 0,
 "gridPos": {

 },
@@ -432,6 +436,7 @@ data:
 "dashes": false,
 "datasource": "$datasource",
 "fill": 1,
+"fillGradient": 0,
 "gridPos": {

 },
@@ -537,6 +542,7 @@ data:
 "dashes": false,
 "datasource": "$datasource",
 "fill": 1,
+"fillGradient": 0,
 "gridPos": {

 },
@@ -643,6 +649,7 @@ data:
 "dashes": false,
 "datasource": "$datasource",
 "fill": 1,
+"fillGradient": 0,
 "gridPos": {

 },
@@ -762,6 +769,7 @@ data:
 "dashes": false,
 "datasource": "$datasource",
 "fill": 1,
+"fillGradient": 0,
 "gridPos": {

 },
@@ -854,6 +862,7 @@ data:
 "dashes": false,
 "datasource": "$datasource",
 "fill": 1,
+"fillGradient": 0,
 "gridPos": {

 },

@@ -172,7 +172,7 @@ data:
 "tableColumn": "",
 "targets": [
 {
-"expr": "sum(kubelet_running_pods{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"})",
+"expr": "sum(kubelet_running_pods{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}) OR sum(kubelet_running_pod_count{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"})",
 "format": "time_series",
 "intervalFactor": 2,
 "legendFormat": "{{instance}}",
@@ -256,7 +256,7 @@ data:
 "tableColumn": "",
 "targets": [
 {
-"expr": "sum(kubelet_running_containers{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"})",
+"expr": "sum(kubelet_running_containers{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}) OR sum(kubelet_running_container_count{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"})",
 "format": "time_series",
 "intervalFactor": 2,
 "legendFormat": "{{instance}}",
@ -553,6 +553,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -645,6 +646,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -750,6 +752,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -855,6 +858,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -954,6 +958,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1066,6 +1071,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1160,6 +1166,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1267,6 +1274,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1374,6 +1382,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1466,6 +1475,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1572,6 +1582,7 @@ data:
|
||||
"datasource": "$datasource",
|
||||
"description": "Pod lifecycle event generator",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1664,6 +1675,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1769,6 +1781,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1874,6 +1887,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -2000,6 +2014,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -2105,6 +2120,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -2197,6 +2213,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -2289,6 +2306,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -2613,6 +2631,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -2705,6 +2724,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -2810,6 +2830,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -2902,6 +2923,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -3007,6 +3029,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -3120,6 +3143,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -3225,6 +3249,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -3330,6 +3355,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -3422,6 +3448,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -3514,6 +3541,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
|
@ -60,7 +60,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "1 - avg(rate(node_cpu_seconds_total{mode=\"idle\", cluster=\"$cluster\"}[$__interval]))",
|
||||
"expr": "1 - avg(rate(node_cpu_seconds_total{mode=\"idle\", cluster=\"$cluster\"}[$__rate_interval]))",
|
||||
"format": "time_series",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -1586,7 +1586,7 @@ data:
|
||||
],
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)",
|
||||
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -1595,7 +1595,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)",
|
||||
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -1604,7 +1604,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)",
|
||||
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -1613,7 +1613,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)",
|
||||
"expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -1622,7 +1622,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)",
|
||||
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -1631,7 +1631,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)",
|
||||
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -1731,7 +1731,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)",
|
||||
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{namespace}}",
|
||||
@ -1829,7 +1829,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)",
|
||||
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{namespace}}",
|
||||
@ -1927,7 +1927,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)",
|
||||
"expr": "avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{namespace}}",
|
||||
@ -2025,7 +2025,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)",
|
||||
"expr": "avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{namespace}}",
|
||||
@ -2123,7 +2123,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)",
|
||||
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{namespace}}",
|
||||
@ -2221,7 +2221,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)",
|
||||
"expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{namespace}}",
|
||||
@ -2319,7 +2319,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)",
|
||||
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{namespace}}",
|
||||
@ -2417,7 +2417,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)",
|
||||
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{namespace}}",
|
||||
@ -4019,7 +4019,7 @@ data:
|
||||
],
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)",
|
||||
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4028,7 +4028,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)",
|
||||
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4037,7 +4037,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)",
|
||||
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4046,7 +4046,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)",
|
||||
"expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4055,7 +4055,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)",
|
||||
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4064,7 +4064,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)",
|
||||
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4164,7 +4164,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)",
|
||||
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -4262,7 +4262,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)",
|
||||
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -4360,7 +4360,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)",
|
||||
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -4458,7 +4458,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)",
|
||||
"expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -4556,7 +4556,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)",
|
||||
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -4654,7 +4654,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)",
|
||||
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
|
@ -1058,7 +1058,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_receive_bytes_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__interval])) by (pod)",
|
||||
"expr": "sum(irate(container_network_receive_bytes_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -1157,7 +1157,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_transmit_bytes_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__interval])) by (pod)",
|
||||
"expr": "sum(irate(container_network_transmit_bytes_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -1256,7 +1256,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_receive_packets_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__interval])) by (pod)",
|
||||
"expr": "sum(irate(container_network_receive_packets_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -1355,7 +1355,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_transmit_packets_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__interval])) by (pod)",
|
||||
"expr": "sum(irate(container_network_transmit_packets_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -1454,7 +1454,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_receive_packets_dropped_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__interval])) by (pod)",
|
||||
"expr": "sum(irate(container_network_receive_packets_dropped_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -1553,7 +1553,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(irate(container_network_transmit_packets_dropped_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__interval])) by (pod)",
|
||||
"expr": "sum(irate(container_network_transmit_packets_dropped_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -2707,7 +2707,7 @@ data:
|
||||
],
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2716,7 +2716,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2725,7 +2725,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2734,7 +2734,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2743,7 +2743,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2752,7 +2752,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -2852,7 +2852,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -2950,7 +2950,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -3048,7 +3048,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -3146,7 +3146,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -3244,7 +3244,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -3342,7 +3342,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -3440,7 +3440,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -3538,7 +3538,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{pod}}",
|
||||
@ -4902,7 +4902,7 @@ data:
|
||||
],
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4911,7 +4911,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4920,7 +4920,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4929,7 +4929,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4938,7 +4938,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -4947,7 +4947,7 @@ data:
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
@ -5047,7 +5047,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{workload}}",
|
||||
@ -5145,7 +5145,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{workload}}",
|
||||
@ -5243,7 +5243,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{workload}}",
|
||||
@ -5341,7 +5341,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{workload}}",
|
||||
@ -5439,7 +5439,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{workload}}",
|
||||
@ -5537,7 +5537,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{workload}}",
|
||||
@ -5635,7 +5635,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{workload}}",
|
||||
@ -5733,7 +5733,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{workload}}",
|
||||
|
@ -140,8 +140,9 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"decimals": 3,
|
||||
"description": "How much error budget is left looking at our 0.990% availability gurantees?",
|
||||
"description": "How much error budget is left looking at our 0.990% availability guarantees?",
|
||||
"fill": 10,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -336,6 +337,7 @@ data:
|
||||
"datasource": "$datasource",
|
||||
"description": "How many read requests (LIST,GET) per second do the apiservers get by code?",
|
||||
"fill": 10,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -444,6 +446,7 @@ data:
|
||||
"datasource": "$datasource",
|
||||
"description": "How many percent of read requests (LIST,GET) per second are returned with errors (5xx)?",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -537,6 +540,7 @@ data:
|
||||
"datasource": "$datasource",
|
||||
"description": "How many seconds is the 99th percentile for reading (LIST|GET) a given resource?",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -729,6 +733,7 @@ data:
|
||||
"datasource": "$datasource",
|
||||
"description": "How many write requests (POST|PUT|PATCH|DELETE) per second do the apiservers get by code?",
|
||||
"fill": 10,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -837,6 +842,7 @@ data:
|
||||
"datasource": "$datasource",
|
||||
"description": "How many percent of write requests (POST|PUT|PATCH|DELETE) per second are returned with errors (5xx)?",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -930,6 +936,7 @@ data:
|
||||
"datasource": "$datasource",
|
||||
"description": "How many seconds is the 99th percentile for writing (POST|PUT|PATCH|DELETE) a given resource?",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1035,6 +1042,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1127,6 +1135,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1219,6 +1228,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1324,6 +1334,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1416,6 +1427,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1508,6 +1520,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1832,6 +1845,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1937,6 +1951,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -2042,6 +2057,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -2147,6 +2163,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -2260,6 +2277,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -2365,6 +2383,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -2470,6 +2489,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -2562,6 +2582,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -2654,6 +2675,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -2868,6 +2890,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -3019,7 +3042,7 @@ data:
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(\n kubelet_volume_stats_capacity_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n -\n kubelet_volume_stats_available_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n)\n/\nkubelet_volume_stats_capacity_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n* 100\n",
|
||||
"expr": "max without(instance,node) (\n(\n kubelet_volume_stats_capacity_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n -\n kubelet_volume_stats_available_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n)\n/\nkubelet_volume_stats_capacity_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n* 100)\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
@ -3064,6 +3087,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -3215,7 +3239,7 @@ data:
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "kubelet_volume_stats_inodes_used{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n/\nkubelet_volume_stats_inodes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n* 100\n",
|
||||
"expr": "max without(instance,node) (\nkubelet_volume_stats_inodes_used{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n/\nkubelet_volume_stats_inodes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n* 100)\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
@ -3505,6 +3529,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -3618,6 +3643,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -3744,6 +3770,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -3857,6 +3884,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -3962,6 +3990,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -4067,6 +4096,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -4159,6 +4189,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -4251,6 +4282,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -4516,7 +4548,7 @@ data:
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}[3m]))",
|
||||
"expr": "sum(rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", container!=\"\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}[3m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
@ -4599,7 +4631,7 @@ data:
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(container_memory_usage_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}) / 1024^3",
|
||||
"expr": "sum(container_memory_usage_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", container!=\"\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}) / 1024^3",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
@ -4682,7 +4714,7 @@ data:
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(rate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}[3m])) + sum(rate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=\"$namespace\",pod=~\"$statefulset.*\"}[3m]))",
|
||||
"expr": "sum(rate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}[3m])) + sum(rate(container_network_receive_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\",pod=~\"$statefulset.*\"}[3m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
@ -5077,6 +5109,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
|
@ -172,7 +172,7 @@ data:
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(avg_over_time(nginx_ingress_controller_nginx_process_connections{cluster=~\"$cluster\", controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\"}[2m]))",
|
||||
"expr": "sum(avg_over_time(nginx_ingress_controller_nginx_process_connections{cluster=~\"$cluster\", controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",state=\"active\"}[2m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
@ -296,6 +296,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -388,6 +389,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -493,6 +495,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -612,6 +615,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -711,6 +715,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -803,6 +808,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
|
@ -36,6 +36,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -129,6 +130,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 0,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -255,6 +257,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -420,7 +423,7 @@ data:
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "100 -\n(\n node_memory_MemAvailable_bytes{job=\"node-exporter\", instance=\"$instance\"}\n/\n node_memory_MemTotal_bytes{job=\"node-exporter\", instance=\"$instance\"}\n* 100\n)\n",
|
||||
"expr": "100 -\n(\n avg(node_memory_MemAvailable_bytes{job=\"node-exporter\", instance=\"$instance\"})\n/\n avg(node_memory_MemTotal_bytes{job=\"node-exporter\", instance=\"$instance\"})\n* 100\n)\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
@ -462,6 +465,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 0,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -578,6 +582,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -697,6 +702,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 0,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -790,6 +796,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 0,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
|
@ -21,7 +21,7 @@ data:
|
||||
"links": [
|
||||
|
||||
],
|
||||
"refresh": "",
|
||||
"refresh": "60s",
|
||||
"rows": [
|
||||
{
|
||||
"collapse": false,
|
||||
@ -36,6 +36,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -72,7 +73,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(\n prometheus_remote_storage_highest_timestamp_in_seconds{cluster=~\"$cluster\", instance=~\"$instance\"} \n- \n ignoring(remote_name, url) group_right(instance) prometheus_remote_storage_queue_highest_sent_timestamp_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}\n)\n",
|
||||
"expr": "(\n prometheus_remote_storage_highest_timestamp_in_seconds{cluster=~\"$cluster\", instance=~\"$instance\"} \n- \n ignoring(remote_name, url) group_right(instance) (prometheus_remote_storage_queue_highest_sent_timestamp_seconds{cluster=~\"$cluster\", instance=~\"$instance\"} != 0)\n)\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
@ -128,6 +129,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -164,7 +166,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "(\n rate(prometheus_remote_storage_highest_timestamp_in_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) \n- \n ignoring (remote_name, url) group_right(instance) rate(prometheus_remote_storage_queue_highest_sent_timestamp_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n)\n",
|
||||
"expr": "clamp_min(\n rate(prometheus_remote_storage_highest_timestamp_in_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) \n- \n ignoring (remote_name, url) group_right(instance) rate(prometheus_remote_storage_queue_highest_sent_timestamp_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n, 0)\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
@ -233,6 +235,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -269,7 +272,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "rate(\n prometheus_remote_storage_samples_in_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n- \n ignoring(remote_name, url) group_right(instance) rate(prometheus_remote_storage_succeeded_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n- \n rate(prometheus_remote_storage_dropped_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n",
|
||||
"expr": "rate(\n prometheus_remote_storage_samples_in_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n- \n ignoring(remote_name, url) group_right(instance) (rate(prometheus_remote_storage_succeeded_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) or rate(prometheus_remote_storage_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]))\n- \n (rate(prometheus_remote_storage_dropped_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) or rate(prometheus_remote_storage_samples_dropped_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]))\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
@ -338,6 +341,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -431,6 +435,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -523,6 +528,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -615,6 +621,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -720,6 +727,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -812,6 +820,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -848,7 +857,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "prometheus_remote_storage_pending_samples{cluster=~\"$cluster\", instance=~\"$instance\"}",
|
||||
"expr": "prometheus_remote_storage_pending_samples{cluster=~\"$cluster\", instance=~\"$instance\"} or prometheus_remote_storage_samples_pending{cluster=~\"$cluster\", instance=~\"$instance\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
@ -917,6 +926,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1009,6 +1019,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1114,6 +1125,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1150,7 +1162,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "rate(prometheus_remote_storage_dropped_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])",
|
||||
"expr": "rate(prometheus_remote_storage_dropped_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) or rate(prometheus_remote_storage_samples_dropped_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
@ -1206,6 +1218,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1242,7 +1255,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "rate(prometheus_remote_storage_failed_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])",
|
||||
"expr": "rate(prometheus_remote_storage_failed_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) or rate(prometheus_remote_storage_samples_failed_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
@ -1298,6 +1311,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1334,7 +1348,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "rate(prometheus_remote_storage_retried_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])",
|
||||
"expr": "rate(prometheus_remote_storage_retried_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) or rate(prometheus_remote_storage_samples_retried_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
|
||||
@ -1390,6 +1404,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
@ -1486,7 +1501,7 @@ data:
|
||||
"schemaVersion": 14,
|
||||
"style": "dark",
|
||||
"tags": [
|
||||
|
||||
"prometheus-mixin"
|
||||
],
|
||||
"templating": {
|
||||
"list": [
|
||||
@ -1630,7 +1645,7 @@ data:
|
||||
]
|
||||
},
|
||||
"timezone": "browser",
|
||||
"title": "Prometheus Remote Write",
|
||||
"title": "Prometheus / Remote Write",
|
||||
"version": 0
|
||||
}
|
||||
prometheus.json: |-
|
||||
@ -1647,7 +1662,7 @@ data:
|
||||
"links": [
|
||||
|
||||
],
|
||||
"refresh": "10s",
|
||||
"refresh": "60s",
|
||||
"rows": [
|
||||
{
|
||||
"collapse": false,
|
||||
@ -2726,7 +2741,7 @@ data:
|
||||
"schemaVersion": 14,
|
||||
"style": "dark",
|
||||
"tags": [
|
||||
|
||||
"prometheus-mixin"
|
||||
],
|
||||
"templating": {
|
||||
"list": [
|
||||
@ -2834,7 +2849,7 @@ data:
|
||||
]
|
||||
},
|
||||
"timezone": "utc",
|
||||
"title": "Prometheus Overview",
|
||||
"title": "Prometheus / Overview",
|
||||
"uid": "",
|
||||
"version": 0
|
||||
}
|
||||
@ -24,7 +24,7 @@ spec:
type: RuntimeDefault
containers:
- name: grafana
image: docker.io/grafana/grafana:7.3.5
image: docker.io/grafana/grafana:8.1.2
env:
- name: GF_PATHS_CONFIG
value: "/etc/grafana/custom.ini"

@ -23,7 +23,7 @@ spec:
type: RuntimeDefault
containers:
- name: nginx-ingress-controller
image: k8s.gcr.io/ingress-nginx/controller:v0.41.2
image: k8s.gcr.io/ingress-nginx/controller:v1.0.0
args:
- /nginx-ingress-controller
- --ingress-class=public

@ -23,7 +23,7 @@ spec:
type: RuntimeDefault
containers:
- name: nginx-ingress-controller
image: k8s.gcr.io/ingress-nginx/controller:v0.41.2
image: k8s.gcr.io/ingress-nginx/controller:v1.0.0
args:
- /nginx-ingress-controller
- --ingress-class=public

@ -23,7 +23,7 @@ spec:
type: RuntimeDefault
containers:
- name: nginx-ingress-controller
image: k8s.gcr.io/ingress-nginx/controller:v0.41.2
image: k8s.gcr.io/ingress-nginx/controller:v1.0.0
args:
- /nginx-ingress-controller
- --ingress-class=public

@ -23,7 +23,7 @@ spec:
type: RuntimeDefault
containers:
- name: nginx-ingress-controller
image: k8s.gcr.io/ingress-nginx/controller:v0.41.2
image: k8s.gcr.io/ingress-nginx/controller:v1.0.0
args:
- /nginx-ingress-controller
- --ingress-class=public

@ -23,7 +23,7 @@ spec:
type: RuntimeDefault
containers:
- name: nginx-ingress-controller
image: k8s.gcr.io/ingress-nginx/controller:v0.41.2
image: k8s.gcr.io/ingress-nginx/controller:v1.0.0
args:
- /nginx-ingress-controller
- --ingress-class=public

@ -72,6 +72,48 @@ data:
regex: apiserver_request_duration_seconds_count;.+
action: drop

# Scrape config for kube-controller-manager endpoints.
#
# kube-controller-manager service endpoints can be discovered by using the
# `endpoints` role and relabelling to keep only endpoints associated with
# kube-system/kube-controller-manager and the `https` port.
- job_name: 'kube-controller-manager'
kubernetes_sd_configs:
- role: endpoints
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecure_skip_verify: true
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
action: keep
regex: kube-system;kube-controller-manager;metrics
- replacement: kube-controller-manager
action: replace
target_label: job

# Scrape config for kube-scheduler endpoints.
#
# kube-scheduler service endpoints can be discovered by using the `endpoints`
# role and relabelling to keep only endpoints associated with
# kube-system/kube-scheduler and the `https` port.
- job_name: 'kube-scheduler'
kubernetes_sd_configs:
- role: endpoints
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecure_skip_verify: true
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
action: keep
regex: kube-system;kube-scheduler;metrics
- replacement: kube-scheduler
action: replace
target_label: job

# Scrape config for node (i.e. kubelet) /metrics (e.g. 'kubelet_'). Explore
# metrics from a node by scraping kubelet (127.0.0.1:10250/metrics).
- job_name: 'kubelet'

@ -21,7 +21,7 @@ spec:
serviceAccountName: prometheus
containers:
- name: prometheus
image: quay.io/prometheus/prometheus:v2.23.0
image: quay.io/prometheus/prometheus:v2.29.1
args:
- --web.listen-address=0.0.0.0:9090
- --config.file=/etc/prometheus/prometheus.yaml

@ -1,11 +1,9 @@
# Allow Prometheus to scrape service endpoints
# Allow Prometheus to discover service endpoints
apiVersion: v1
kind: Service
metadata:
name: kube-controller-manager
namespace: kube-system
annotations:
prometheus.io/scrape: 'true'
spec:
type: ClusterIP
clusterIP: None
@ -14,5 +12,5 @@ spec:
ports:
- name: metrics
protocol: TCP
port: 10252
targetPort: 10252
port: 10257
targetPort: 10257

@ -1,11 +1,9 @@
# Allow Prometheus to scrape service endpoints
# Allow Prometheus to discover service endpoints
apiVersion: v1
kind: Service
metadata:
name: kube-scheduler
namespace: kube-system
annotations:
prometheus.io/scrape: 'true'
spec:
type: ClusterIP
clusterIP: None
@ -14,5 +12,5 @@ spec:
ports:
- name: metrics
protocol: TCP
port: 10251
targetPort: 10251
port: 10259
targetPort: 10259

@ -25,7 +25,7 @@ spec:
serviceAccountName: kube-state-metrics
containers:
- name: kube-state-metrics
image: quay.io/coreos/kube-state-metrics:v2.0.0-alpha.3
image: k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.1.1
ports:
- name: metrics
containerPort: 8080

@ -28,13 +28,13 @@ spec:
hostPID: true
containers:
- name: node-exporter
image: quay.io/prometheus/node-exporter:v1.0.1
image: quay.io/prometheus/node-exporter:v1.2.2
args:
- --path.procfs=/host/proc
- --path.sysfs=/host/sys
- --path.rootfs=/host/root
- --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
- --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
- --collector.filesystem.mount-points-exclude=^/(dev|proc|sys|var/lib/docker/.+)($|/)
- --collector.filesystem.fs-types-exclude=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
ports:
- name: metrics
containerPort: 9100
|
@ -9,7 +9,8 @@ data:
|
||||
{
|
||||
"alert": "etcdMembersDown",
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": members are down ({{ $value }})."
|
||||
"description": "etcd cluster \"{{ $labels.job }}\": members are down ({{ $value }}).",
|
||||
"summary": "etcd cluster members are down."
|
||||
},
|
||||
"expr": "max without (endpoint) (\n sum without (instance) (up{job=~\".*etcd.*\"} == bool 0)\nor\n count without (To) (\n sum without (instance) (rate(etcd_network_peer_sent_failures_total{job=~\".*etcd.*\"}[120s])) > 0.01\n )\n)\n> 0\n",
|
||||
"for": "10m",
|
||||
@ -20,7 +21,8 @@ data:
|
||||
{
|
||||
"alert": "etcdInsufficientMembers",
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": insufficient members ({{ $value }})."
|
||||
"description": "etcd cluster \"{{ $labels.job }}\": insufficient members ({{ $value }}).",
|
||||
"summary": "etcd cluster has insufficient number of members."
|
||||
},
|
||||
"expr": "sum(up{job=~\".*etcd.*\"} == bool 1) without (instance) < ((count(up{job=~\".*etcd.*\"}) without (instance) + 1) / 2)\n",
|
||||
"for": "3m",
|
||||
@ -31,7 +33,8 @@ data:
|
||||
{
|
||||
"alert": "etcdNoLeader",
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": member {{ $labels.instance }} has no leader."
|
||||
"description": "etcd cluster \"{{ $labels.job }}\": member {{ $labels.instance }} has no leader.",
|
||||
"summary": "etcd cluster has no leader."
|
||||
},
|
||||
"expr": "etcd_server_has_leader{job=~\".*etcd.*\"} == 0\n",
|
||||
"for": "1m",
|
||||
@ -42,7 +45,8 @@ data:
|
||||
{
|
||||
"alert": "etcdHighNumberOfLeaderChanges",
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": {{ $value }} leader changes within the last 15 minutes. Frequent elections may be a sign of insufficient resources, high network latency, or disruptions by other components and should be investigated."
|
||||
"description": "etcd cluster \"{{ $labels.job }}\": {{ $value }} leader changes within the last 15 minutes. Frequent elections may be a sign of insufficient resources, high network latency, or disruptions by other components and should be investigated.",
|
||||
"summary": "etcd cluster has high number of leader changes."
|
||||
},
|
||||
"expr": "increase((max without (instance) (etcd_server_leader_changes_seen_total{job=~\".*etcd.*\"}) or 0*absent(etcd_server_leader_changes_seen_total{job=~\".*etcd.*\"}))[15m:1m]) >= 4\n",
|
||||
"for": "5m",
|
||||
@ -53,7 +57,8 @@ data:
|
||||
{
|
||||
"alert": "etcdGRPCRequestsSlow",
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": gRPC requests to {{ $labels.grpc_method }} are taking {{ $value }}s on etcd instance {{ $labels.instance }}."
|
||||
"description": "etcd cluster \"{{ $labels.job }}\": gRPC requests to {{ $labels.grpc_method }} are taking {{ $value }}s on etcd instance {{ $labels.instance }}.",
|
||||
"summary": "etcd grpc requests are slow"
|
||||
},
|
||||
"expr": "histogram_quantile(0.99, sum(rate(grpc_server_handling_seconds_bucket{job=~\".*etcd.*\", grpc_type=\"unary\"}[5m])) without(grpc_type))\n> 0.15\n",
|
||||
"for": "10m",
|
||||
@ -64,7 +69,8 @@ data:
|
||||
{
|
||||
"alert": "etcdMemberCommunicationSlow",
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": member communication with {{ $labels.To }} is taking {{ $value }}s on etcd instance {{ $labels.instance }}."
|
||||
"description": "etcd cluster \"{{ $labels.job }}\": member communication with {{ $labels.To }} is taking {{ $value }}s on etcd instance {{ $labels.instance }}.",
|
||||
"summary": "etcd cluster member communication is slow."
|
||||
},
|
||||
"expr": "histogram_quantile(0.99, rate(etcd_network_peer_round_trip_time_seconds_bucket{job=~\".*etcd.*\"}[5m]))\n> 0.15\n",
|
||||
"for": "10m",
|
||||
@ -75,7 +81,8 @@ data:
|
||||
{
|
||||
"alert": "etcdHighNumberOfFailedProposals",
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": {{ $value }} proposal failures within the last 30 minutes on etcd instance {{ $labels.instance }}."
|
||||
"description": "etcd cluster \"{{ $labels.job }}\": {{ $value }} proposal failures within the last 30 minutes on etcd instance {{ $labels.instance }}.",
|
||||
"summary": "etcd cluster has high number of proposal failures."
|
||||
},
|
||||
"expr": "rate(etcd_server_proposals_failed_total{job=~\".*etcd.*\"}[15m]) > 5\n",
|
||||
"for": "15m",
|
||||
@ -86,7 +93,8 @@ data:
|
||||
{
|
||||
"alert": "etcdHighFsyncDurations",
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": 99th percentile fync durations are {{ $value }}s on etcd instance {{ $labels.instance }}."
|
||||
"description": "etcd cluster \"{{ $labels.job }}\": 99th percentile fsync durations are {{ $value }}s on etcd instance {{ $labels.instance }}.",
|
||||
"summary": "etcd cluster 99th percentile fsync durations are too high."
|
||||
},
|
||||
"expr": "histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket{job=~\".*etcd.*\"}[5m]))\n> 0.5\n",
|
||||
"for": "10m",
|
||||
@ -94,10 +102,22 @@ data:
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "etcdHighFsyncDurations",
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": 99th percentile fsync durations are {{ $value }}s on etcd instance {{ $labels.instance }}."
|
||||
},
|
||||
"expr": "histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket{job=~\".*etcd.*\"}[5m]))\n> 1\n",
|
||||
"for": "10m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "etcdHighCommitDurations",
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": 99th percentile commit durations {{ $value }}s on etcd instance {{ $labels.instance }}."
|
||||
"description": "etcd cluster \"{{ $labels.job }}\": 99th percentile commit durations {{ $value }}s on etcd instance {{ $labels.instance }}.",
|
||||
"summary": "etcd cluster 99th percentile commit durations are too high."
|
||||
},
|
||||
"expr": "histogram_quantile(0.99, rate(etcd_disk_backend_commit_duration_seconds_bucket{job=~\".*etcd.*\"}[5m]))\n> 0.25\n",
|
||||
"for": "10m",
|
||||
@ -108,7 +128,8 @@ data:
|
||||
{
|
||||
"alert": "etcdHighNumberOfFailedHTTPRequests",
|
||||
"annotations": {
|
||||
"message": "{{ $value }}% of requests for {{ $labels.method }} failed on etcd instance {{ $labels.instance }}"
|
||||
"description": "{{ $value }}% of requests for {{ $labels.method }} failed on etcd instance {{ $labels.instance }}",
|
||||
"summary": "etcd has high number of failed HTTP requests."
|
||||
},
|
||||
"expr": "sum(rate(etcd_http_failed_total{job=~\".*etcd.*\", code!=\"404\"}[5m])) without (code) / sum(rate(etcd_http_received_total{job=~\".*etcd.*\"}[5m]))\nwithout (code) > 0.01\n",
|
||||
"for": "10m",
|
||||
@ -119,7 +140,8 @@ data:
|
||||
{
|
||||
"alert": "etcdHighNumberOfFailedHTTPRequests",
|
||||
"annotations": {
|
||||
"message": "{{ $value }}% of requests for {{ $labels.method }} failed on etcd instance {{ $labels.instance }}."
|
||||
"description": "{{ $value }}% of requests for {{ $labels.method }} failed on etcd instance {{ $labels.instance }}.",
|
||||
"summary": "etcd has high number of failed HTTP requests."
|
||||
},
|
||||
"expr": "sum(rate(etcd_http_failed_total{job=~\".*etcd.*\", code!=\"404\"}[5m])) without (code) / sum(rate(etcd_http_received_total{job=~\".*etcd.*\"}[5m]))\nwithout (code) > 0.05\n",
|
||||
"for": "10m",
|
||||
@ -130,13 +152,36 @@ data:
|
||||
{
|
||||
"alert": "etcdHTTPRequestsSlow",
|
||||
"annotations": {
|
||||
"message": "etcd instance {{ $labels.instance }} HTTP requests to {{ $labels.method }} are slow."
|
||||
"description": "etcd instance {{ $labels.instance }} HTTP requests to {{ $labels.method }} are slow.",
|
||||
"summary": "etcd instance HTTP requests are slow."
|
||||
},
|
||||
"expr": "histogram_quantile(0.99, rate(etcd_http_successful_duration_seconds_bucket[5m]))\n> 0.15\n",
|
||||
"for": "10m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "etcdBackendQuotaLowSpace",
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": database size exceeds the defined quota on etcd instance {{ $labels.instance }}, please defrag or increase the quota as the writes to etcd will be disabled when it is full."
|
||||
},
|
||||
"expr": "(etcd_mvcc_db_total_size_in_bytes/etcd_server_quota_backend_bytes)*100 > 95\n",
|
||||
"for": "10m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "etcdExcessiveDatabaseGrowth",
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": Observed surge in etcd writes leading to 50% increase in database size over the past four hours on etcd instance {{ $labels.instance }}, please check as it might be disruptive."
|
||||
},
|
||||
"expr": "increase(((etcd_mvcc_db_total_size_in_bytes/etcd_server_quota_backend_bytes)*100)[240m:1m]) > 50\n",
|
||||
"for": "10m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
@ -276,10 +321,6 @@ data:
|
||||
},
|
||||
"record": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile"
|
||||
},
|
||||
{
|
||||
"expr": "sum(rate(apiserver_request_duration_seconds_sum{subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT\"}[5m])) without(instance, pod)\n/\nsum(rate(apiserver_request_duration_seconds_count{subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT\"}[5m])) without(instance, pod)\n",
|
||||
"record": "cluster:apiserver_request_duration_seconds:mean5m"
|
||||
},
|
||||
{
|
||||
"expr": "histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT\"}[5m])) without(instance, pod))\n",
|
||||
"labels": {
|
||||
@ -443,10 +484,6 @@ data:
|
||||
{
|
||||
"name": "k8s.rules",
|
||||
"rules": [
|
||||
{
|
||||
"expr": "sum(rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", image!=\"\", container!=\"POD\"}[5m])) by (namespace)\n",
|
||||
"record": "namespace:container_cpu_usage_seconds_total:sum_rate"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (cluster, namespace, pod, container) (\n rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", image!=\"\", container!=\"POD\"}[5m])\n) * on (cluster, namespace, pod) group_left(node) topk by (cluster, namespace, pod) (\n 1, max by(cluster, namespace, pod, node) (kube_pod_info{node!=\"\"})\n)\n",
|
||||
"record": "node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate"
|
||||
@ -467,10 +504,6 @@ data:
|
||||
"expr": "container_memory_swap{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) topk by(namespace, pod) (1,\n max by(namespace, pod, node) (kube_pod_info{node!=\"\"})\n)\n",
|
||||
"record": "node_namespace_pod_container:container_memory_swap"
|
||||
},
|
||||
{
|
||||
"expr": "sum(container_memory_usage_bytes{job=\"kubernetes-cadvisor\", image!=\"\", container!=\"POD\"}) by (namespace)\n",
|
||||
"record": "namespace:container_memory_usage_bytes:sum"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (namespace) (\n sum by (namespace, pod) (\n max by (namespace, pod, container) (\n kube_pod_container_resource_requests_memory_bytes{job=\"kube-state-metrics\"}\n ) * on(namespace, pod) group_left() max by (namespace, pod) (\n kube_pod_status_phase{phase=~\"Pending|Running\"} == 1\n )\n )\n)\n",
|
||||
"record": "namespace:kube_pod_container_resource_requests_memory_bytes:sum"
|
||||
@ -573,10 +606,6 @@ data:
|
||||
{
|
||||
"name": "node.rules",
|
||||
"rules": [
|
||||
{
|
||||
"expr": "sum(min(kube_pod_info{node!=\"\"}) by (cluster, node))\n",
|
||||
"record": ":kube_pod_info_node_count:"
|
||||
},
|
||||
{
|
||||
"expr": "topk by(namespace, pod) (1,\n max by (node, namespace, pod) (\n label_replace(kube_pod_info{job=\"kube-state-metrics\",node!=\"\"}, \"pod\", \"$1\", \"pod\", \"(.*)\")\n))\n",
|
||||
"record": "node_namespace_pod:kube_pod_info:"
|
||||
@ -779,7 +808,7 @@ data:
|
||||
{
|
||||
"alert": "KubeJobFailed",
|
||||
"annotations": {
|
||||
"description": "Job {{ $labels.namespace }}/{{ $labels.job_name }} failed to complete.",
|
||||
"description": "Job {{ $labels.namespace }}/{{ $labels.job_name }} failed to complete. Removing failed job after investigation should clear this alert.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubejobfailed",
|
||||
"summary": "Job failed to complete."
|
||||
},
|
||||
@ -796,7 +825,7 @@ data:
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubehpareplicasmismatch",
|
||||
"summary": "HPA has not matched descired number of replicas."
|
||||
},
|
||||
"expr": "(kube_hpa_status_desired_replicas{job=\"kube-state-metrics\"}\n !=\nkube_hpa_status_current_replicas{job=\"kube-state-metrics\"})\n and\nchanges(kube_hpa_status_current_replicas[15m]) == 0\n",
|
||||
"expr": "(kube_hpa_status_desired_replicas{job=\"kube-state-metrics\"}\n !=\nkube_hpa_status_current_replicas{job=\"kube-state-metrics\"})\n and\n(kube_hpa_status_current_replicas{job=\"kube-state-metrics\"}\n >\nkube_hpa_spec_min_replicas{job=\"kube-state-metrics\"})\n and\n(kube_hpa_status_current_replicas{job=\"kube-state-metrics\"}\n <\nkube_hpa_spec_max_replicas{job=\"kube-state-metrics\"})\n and\nchanges(kube_hpa_status_current_replicas[15m]) == 0\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
@ -866,7 +895,7 @@ data:
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubememoryquotaovercommit",
|
||||
"summary": "Cluster has overcommitted memory resource requests."
|
||||
},
|
||||
"expr": "sum(kube_resourcequota{job=\"kube-state-metrics\", type=\"hard\", resource=\"memory\"})\n /\nsum(kube_node_status_allocatable_memory_bytes{job=\"node-exporter\"})\n > 1.5\n",
|
||||
"expr": "sum(kube_resourcequota{job=\"kube-state-metrics\", type=\"hard\", resource=\"memory\"})\n /\nsum(kube_node_status_allocatable_memory_bytes{job=\"kube-state-metrics\"})\n > 1.5\n",
|
||||
"for": "5m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
@ -1096,11 +1125,11 @@ data:
|
||||
{
|
||||
"alert": "AggregatedAPIErrors",
|
||||
"annotations": {
|
||||
"description": "An aggregated API {{ $labels.name }}/{{ $labels.namespace }} has reported errors. The number of errors have increased for it in the past five minutes. High values indicate that the availability of the service changes too often.",
|
||||
"description": "An aggregated API {{ $labels.name }}/{{ $labels.namespace }} has reported errors. It has appeared unavailable {{ $value | humanize }} times averaged over the past 10m.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-aggregatedapierrors",
|
||||
"summary": "An aggregated API has reported errors."
|
||||
},
|
||||
"expr": "sum by(name, namespace)(increase(aggregator_unavailable_apiservice_count[5m])) > 2\n",
|
||||
"expr": "sum by(name, namespace)(increase(aggregator_unavailable_apiservice_count[10m])) > 4\n",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
@@ -1341,115 +1370,6 @@ data:
}
]
}
loki.yaml: |-
{
"groups": [
{
"name": "loki_rules",
"rules": [
{
"expr": "histogram_quantile(0.99, sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, job))",
"record": "job:loki_request_duration_seconds:99quantile"
},
{
"expr": "histogram_quantile(0.50, sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, job))",
"record": "job:loki_request_duration_seconds:50quantile"
},
{
"expr": "sum(rate(loki_request_duration_seconds_sum[1m])) by (job) / sum(rate(loki_request_duration_seconds_count[1m])) by (job)",
"record": "job:loki_request_duration_seconds:avg"
},
{
"expr": "sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, job)",
"record": "job:loki_request_duration_seconds_bucket:sum_rate"
},
{
"expr": "sum(rate(loki_request_duration_seconds_sum[1m])) by (job)",
"record": "job:loki_request_duration_seconds_sum:sum_rate"
},
{
"expr": "sum(rate(loki_request_duration_seconds_count[1m])) by (job)",
"record": "job:loki_request_duration_seconds_count:sum_rate"
},
{
"expr": "histogram_quantile(0.99, sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, job, route))",
"record": "job_route:loki_request_duration_seconds:99quantile"
},
{
"expr": "histogram_quantile(0.50, sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, job, route))",
"record": "job_route:loki_request_duration_seconds:50quantile"
},
{
"expr": "sum(rate(loki_request_duration_seconds_sum[1m])) by (job, route) / sum(rate(loki_request_duration_seconds_count[1m])) by (job, route)",
"record": "job_route:loki_request_duration_seconds:avg"
},
{
"expr": "sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, job, route)",
"record": "job_route:loki_request_duration_seconds_bucket:sum_rate"
},
{
"expr": "sum(rate(loki_request_duration_seconds_sum[1m])) by (job, route)",
"record": "job_route:loki_request_duration_seconds_sum:sum_rate"
},
{
"expr": "sum(rate(loki_request_duration_seconds_count[1m])) by (job, route)",
"record": "job_route:loki_request_duration_seconds_count:sum_rate"
},
{
"expr": "histogram_quantile(0.99, sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, namespace, job, route))",
"record": "namespace_job_route:loki_request_duration_seconds:99quantile"
},
{
"expr": "histogram_quantile(0.50, sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, namespace, job, route))",
"record": "namespace_job_route:loki_request_duration_seconds:50quantile"
},
{
"expr": "sum(rate(loki_request_duration_seconds_sum[1m])) by (namespace, job, route) / sum(rate(loki_request_duration_seconds_count[1m])) by (namespace, job, route)",
"record": "namespace_job_route:loki_request_duration_seconds:avg"
},
{
"expr": "sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, namespace, job, route)",
"record": "namespace_job_route:loki_request_duration_seconds_bucket:sum_rate"
},
{
"expr": "sum(rate(loki_request_duration_seconds_sum[1m])) by (namespace, job, route)",
"record": "namespace_job_route:loki_request_duration_seconds_sum:sum_rate"
},
{
"expr": "sum(rate(loki_request_duration_seconds_count[1m])) by (namespace, job, route)",
"record": "namespace_job_route:loki_request_duration_seconds_count:sum_rate"
}
]
},
{
"name": "loki_alerts",
"rules": [
{
"alert": "LokiRequestErrors",
"annotations": {
"message": "{{ $labels.job }} {{ $labels.route }} is experiencing {{ printf \"%.2f\" $value }}% errors.\n"
},
"expr": "100 * sum(rate(loki_request_duration_seconds_count{status_code=~\"5..\"}[1m])) by (namespace, job, route)\n /\nsum(rate(loki_request_duration_seconds_count[1m])) by (namespace, job, route)\n > 10\n",
"for": "15m",
"labels": {
"severity": "critical"
}
},
{
"alert": "LokiRequestLatency",
"annotations": {
"message": "{{ $labels.job }} {{ $labels.route }} is experiencing {{ printf \"%.2f\" $value }}s 99th percentile latency.\n"
},
"expr": "namespace_job_route:loki_request_duration_seconds:99quantile{route!~\"(?i).*tail.*\"} > 1\n",
"for": "15m",
"labels": {
"severity": "critical"
}
}
]
}
]
}
node-exporter.yaml: |-
{
"groups": [
@@ -1607,7 +1527,7 @@ data:
"description": "{{ $labels.instance }} interface {{ $labels.device }} has encountered {{ printf \"%.0f\" $value }} receive errors in the last two minutes.",
"summary": "Network interface is reporting many receive errors."
},
"expr": "increase(node_network_receive_errs_total[2m]) > 10\n",
"expr": "rate(node_network_receive_errs_total[2m]) / rate(node_network_receive_packets_total[2m]) > 0.01\n",
"for": "1h",
"labels": {
"severity": "warning"
@@ -1619,7 +1539,7 @@ data:
"description": "{{ $labels.instance }} interface {{ $labels.device }} has encountered {{ printf \"%.0f\" $value }} transmit errors in the last two minutes.",
"summary": "Network interface is reporting many transmit errors."
},
"expr": "increase(node_network_transmit_errs_total[2m]) > 10\n",
"expr": "rate(node_network_transmit_errs_total[2m]) / rate(node_network_transmit_packets_total[2m]) > 0.01\n",
"for": "1h",
"labels": {
"severity": "warning"
@@ -1665,7 +1585,7 @@ data:
"message": "Clock on {{ $labels.instance }} is not synchronising. Ensure NTP is configured on this host.",
"summary": "Clock not synchronising."
},
"expr": "min_over_time(node_timex_sync_status[5m]) == 0\n",
"expr": "min_over_time(node_timex_sync_status[5m]) == 0\nand\nnode_timex_maxerror_seconds >= 16\n",
"for": "10m",
"labels": {
"severity": "warning"
@@ -1740,18 +1660,6 @@ data:
"severity": "warning"
}
},
{
"alert": "PrometheusErrorSendingAlertsToAnyAlertmanager",
"annotations": {
"description": "{{ printf \"%.1f\" $value }}% minimum errors while sending alerts from Prometheus {{$labels.instance}} to any Alertmanager.",
"summary": "Prometheus encounters more than 3% errors sending alerts to any Alertmanager."
},
"expr": "min without(alertmanager) (\n rate(prometheus_notifications_errors_total{job=\"prometheus\"}[5m])\n/\n rate(prometheus_notifications_sent_total{job=\"prometheus\"}[5m])\n)\n* 100\n> 3\n",
"for": "15m",
"labels": {
"severity": "critical"
}
},
{
"alert": "PrometheusNotConnectedToAlertmanagers",
"annotations": {
@@ -1794,7 +1702,7 @@ data:
"description": "Prometheus {{$labels.instance}} is not ingesting samples.",
"summary": "Prometheus is not ingesting samples."
},
"expr": "rate(prometheus_tsdb_head_samples_appended_total{job=\"prometheus\"}[5m]) <= 0\n",
"expr": "(\n rate(prometheus_tsdb_head_samples_appended_total{job=\"prometheus\"}[5m]) <= 0\nand\n (\n sum without(scrape_job) (prometheus_target_metadata_cache_entries{job=\"prometheus\"}) > 0\n or\n sum without(rule_group) (prometheus_rule_group_rules{job=\"prometheus\"}) > 0\n )\n)\n",
"for": "10m",
"labels": {
"severity": "warning"
@@ -1842,7 +1750,7 @@ data:
"description": "Prometheus {{$labels.instance}} remote write is {{ printf \"%.1f\" $value }}s behind for {{ $labels.remote_name}}:{{ $labels.url }}.",
"summary": "Prometheus remote write is behind."
},
"expr": "# Without max_over_time, failed scrapes could create false negatives, see\n# https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details.\n(\n max_over_time(prometheus_remote_storage_highest_timestamp_in_seconds{job=\"prometheus\"}[5m])\n- on(job, instance) group_right\n max_over_time(prometheus_remote_storage_queue_highest_sent_timestamp_seconds{job=\"prometheus\"}[5m])\n)\n> 120\n",
"expr": "# Without max_over_time, failed scrapes could create false negatives, see\n# https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details.\n(\n max_over_time(prometheus_remote_storage_highest_timestamp_in_seconds{job=\"prometheus\"}[5m])\n- ignoring(remote_name, url) group_right\n max_over_time(prometheus_remote_storage_queue_highest_sent_timestamp_seconds{job=\"prometheus\"}[5m])\n)\n> 120\n",
"for": "15m",
"labels": {
"severity": "critical"
@@ -1895,6 +1803,18 @@ data:
"labels": {
"severity": "warning"
}
},
{
"alert": "PrometheusErrorSendingAlertsToAnyAlertmanager",
"annotations": {
"description": "{{ printf \"%.1f\" $value }}% minimum errors while sending alerts from Prometheus {{$labels.instance}} to any Alertmanager.",
"summary": "Prometheus encounters more than 3% errors sending alerts to any Alertmanager."
},
"expr": "min without (alertmanager) (\n rate(prometheus_notifications_errors_total{job=\"prometheus\",alertmanager!~``}[5m])\n/\n rate(prometheus_notifications_sent_total{job=\"prometheus\",alertmanager!~``}[5m])\n)\n* 100\n> 3\n",
"for": "15m",
"labels": {
"severity": "critical"
}
}
]
}
@@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster

## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>

* Kubernetes v1.20.0 (upstream)
* Kubernetes v1.22.1 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/fedora-coreos/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

@@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=4edd79dd0295e6ffa4c8ed04fd5914d5cb3f1b7c"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d7fd3f62661def56e231602a3b101ff5e9ea8447"

cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]

@@ -62,7 +62,6 @@ data "template_file" "controller-configs" {

vars = {
# Cannot use cyclic dependencies on controllers or their DNS records
etcd_arch = var.arch == "arm64" ? "-arm64" : ""
etcd_name = "etcd${count.index}"
etcd_domain = "${var.cluster_name}-etcd${count.index}.${var.dns_zone}"
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
|
@ -1,6 +1,6 @@
|
||||
---
|
||||
variant: fcos
|
||||
version: 1.1.0
|
||||
version: 1.4.0
|
||||
systemd:
|
||||
units:
|
||||
- name: etcd-member.service
|
||||
@ -12,7 +12,7 @@ systemd:
|
||||
Wants=network-online.target network.target
|
||||
After=network-online.target
|
||||
[Service]
|
||||
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.14${etcd_arch}
|
||||
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.5.0
|
||||
Type=exec
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/etcd
|
||||
ExecStartPre=-/usr/bin/podman rm etcd
|
||||
@ -50,10 +50,13 @@ systemd:
|
||||
contents: |
|
||||
[Unit]
|
||||
Description=Kubelet (System Container)
|
||||
Requires=afterburn.service
|
||||
After=afterburn.service
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
EnvironmentFile=/run/metadata/afterburn
|
||||
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
@ -64,12 +67,12 @@ systemd:
|
||||
--privileged \
|
||||
--pid host \
|
||||
--network host \
|
||||
--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
|
||||
--volume /etc/kubernetes:/etc/kubernetes:ro,z \
|
||||
--volume /usr/lib/os-release:/etc/os-release:ro \
|
||||
--volume /lib/modules:/lib/modules:ro \
|
||||
--volume /run:/run \
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
|
||||
--volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup \
|
||||
--volume /var/lib/calico:/var/lib/calico:ro \
|
||||
--volume /var/lib/docker:/var/lib/docker \
|
||||
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
|
||||
@ -87,12 +90,12 @@ systemd:
|
||||
--client-ca-file=/etc/kubernetes/ca.crt \
|
||||
--cluster_dns=${cluster_dns_service_ip} \
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--healthz-port=0 \
|
||||
--kubeconfig=/var/lib/kubelet/kubeconfig \
|
||||
--network-plugin=cni \
|
||||
--node-labels=node.kubernetes.io/controller="true" \
|
||||
--pod-manifest-path=/etc/kubernetes/manifests \
|
||||
--provider-id=aws:///$${AFTERBURN_AWS_AVAILABILITY_ZONE}/$${AFTERBURN_AWS_INSTANCE_ID} \
|
||||
--read-only-port=0 \
|
||||
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
|
||||
--rotate-certificates \
|
||||
@ -119,7 +122,7 @@ systemd:
|
||||
--volume /opt/bootstrap/assets:/assets:ro,Z \
|
||||
--volume /opt/bootstrap/apply:/apply:ro,Z \
|
||||
--entrypoint=/apply \
|
||||
quay.io/poseidon/kubelet:v1.20.0
|
||||
quay.io/poseidon/kubelet:v1.22.1
|
||||
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
|
||||
ExecStartPost=-/usr/bin/podman stop bootstrap
|
||||
storage:
|
||||
|
@@ -201,8 +201,8 @@ resource "aws_security_group_rule" "controller-scheduler-metrics" {

type = "ingress"
protocol = "tcp"
from_port = 10251
to_port = 10251
from_port = 10259
to_port = 10259
source_security_group_id = aws_security_group.worker.id
}

@@ -212,8 +212,8 @@ resource "aws_security_group_rule" "controller-manager-metrics" {

type = "ingress"
protocol = "tcp"
from_port = 10252
to_port = 10252
from_port = 10257
to_port = 10257
source_security_group_id = aws_security_group.worker.id
}

@@ -55,13 +55,13 @@ variable "os_stream" {
variable "disk_size" {
type = number
description = "Size of the EBS volume in GB"
default = 40
default = 30
}

variable "disk_type" {
type = string
description = "Type of the EBS volume (e.g. standard, gp2, io1)"
default = "gp2"
description = "Type of the EBS volume (e.g. standard, gp2, gp3, io1)"
default = "gp3"
}

variable "disk_iops" {
@@ -84,13 +84,13 @@ variable "worker_target_groups" {

variable "controller_snippets" {
type = list(string)
description = "Controller Fedora CoreOS Config snippets"
description = "Controller Butane snippets"
default = []
}

variable "worker_snippets" {
type = list(string)
description = "Worker Fedora CoreOS Config snippets"
description = "Worker Butane snippets"
default = []
}

@@ -176,4 +176,3 @@ variable "daemonset_tolerations" {
description = "List of additional taint keys kube-system DaemonSets should tolerate (e.g. ['custom-role', 'gpu-role'])"
default = []
}

@@ -1,7 +1,7 @@
# Terraform version and plugin versions

terraform {
required_version = "~> 0.13.0"
required_version = ">= 0.13.0, < 2.0.0"
required_providers {
aws = ">= 2.23, <= 4.0"
template = "~> 2.1"
@@ -9,7 +9,7 @@ terraform {

ct = {
source = "poseidon/ct"
version = "~> 0.6"
version = "~> 0.9"
}
}
}
@ -1,6 +1,6 @@
|
||||
---
|
||||
variant: fcos
|
||||
version: 1.1.0
|
||||
version: 1.4.0
|
||||
systemd:
|
||||
units:
|
||||
- name: docker.service
|
||||
@ -23,10 +23,13 @@ systemd:
|
||||
contents: |
|
||||
[Unit]
|
||||
Description=Kubelet (System Container)
|
||||
Requires=afterburn.service
|
||||
After=afterburn.service
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
EnvironmentFile=/run/metadata/afterburn
|
||||
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
@ -37,12 +40,12 @@ systemd:
|
||||
--privileged \
|
||||
--pid host \
|
||||
--network host \
|
||||
--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
|
||||
--volume /etc/kubernetes:/etc/kubernetes:ro,z \
|
||||
--volume /usr/lib/os-release:/etc/os-release:ro \
|
||||
--volume /lib/modules:/lib/modules:ro \
|
||||
--volume /run:/run \
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
|
||||
--volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup \
|
||||
--volume /var/lib/calico:/var/lib/calico:ro \
|
||||
--volume /var/lib/docker:/var/lib/docker \
|
||||
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
|
||||
@ -60,7 +63,6 @@ systemd:
|
||||
--client-ca-file=/etc/kubernetes/ca.crt \
|
||||
--cluster_dns=${cluster_dns_service_ip} \
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--healthz-port=0 \
|
||||
--kubeconfig=/var/lib/kubelet/kubeconfig \
|
||||
--network-plugin=cni \
|
||||
@ -72,6 +74,7 @@ systemd:
|
||||
--register-with-taints=${taint} \
|
||||
%{~ endfor ~}
|
||||
--pod-manifest-path=/etc/kubernetes/manifests \
|
||||
--provider-id=aws:///$${AFTERBURN_AWS_AVAILABILITY_ZONE}/$${AFTERBURN_AWS_INSTANCE_ID} \
|
||||
--read-only-port=0 \
|
||||
--rotate-certificates \
|
||||
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
|
||||
@ -87,7 +90,7 @@ systemd:
|
||||
[Unit]
|
||||
Description=Delete Kubernetes node on shutdown
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
ExecStart=/bin/true
|
||||
|
@ -48,13 +48,13 @@ variable "os_stream" {
|
||||
variable "disk_size" {
|
||||
type = number
|
||||
description = "Size of the EBS volume in GB"
|
||||
default = 40
|
||||
default = 30
|
||||
}
|
||||
|
||||
variable "disk_type" {
|
||||
type = string
|
||||
description = "Type of the EBS volume (e.g. standard, gp2, io1)"
|
||||
default = "gp2"
|
||||
description = "Type of the EBS volume (e.g. standard, gp2, gp3, io1)"
|
||||
default = "gp3"
|
||||
}
|
||||
|
||||
variable "disk_iops" {
|
||||
@ -77,7 +77,7 @@ variable "target_groups" {
|
||||
|
||||
variable "snippets" {
|
||||
type = list(string)
|
||||
description = "Fedora CoreOS Config snippets"
|
||||
description = "Butane snippets"
|
||||
default = []
|
||||
}
|
||||
|
||||
|
@ -1,14 +1,14 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.13.0"
|
||||
required_version = ">= 0.13.0, < 2.0.0"
|
||||
required_providers {
|
||||
aws = ">= 2.23, <= 4.0"
|
||||
template = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6"
|
||||
version = "~> 0.9"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster

## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>

* Kubernetes v1.20.0 (upstream)
* Kubernetes v1.22.1 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/flatcar-linux/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

@@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=4edd79dd0295e6ffa4c8ed04fd5914d5cb3f1b7c"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d7fd3f62661def56e231602a3b101ff5e9ea8447"

cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
@@ -12,5 +12,6 @@ module "bootstrap" {
cluster_domain_suffix = var.cluster_domain_suffix
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
daemonset_tolerations = var.daemonset_tolerations
}
@ -10,7 +10,7 @@ systemd:
|
||||
Requires=docker.service
|
||||
After=docker.service
|
||||
[Service]
|
||||
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.14
|
||||
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.5.0
|
||||
ExecStartPre=/usr/bin/docker run -d \
|
||||
--name etcd \
|
||||
--network host \
|
||||
@ -53,11 +53,13 @@ systemd:
|
||||
Description=Kubelet (System Container)
|
||||
Requires=docker.service
|
||||
After=docker.service
|
||||
Requires=coreos-metadata.service
|
||||
After=coreos-metadata.service
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
EnvironmentFile=/run/metadata/coreos
|
||||
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
@ -68,6 +70,7 @@ systemd:
|
||||
--privileged \
|
||||
--pid host \
|
||||
--network host \
|
||||
-v /etc/cni/net.d:/etc/cni/net.d:ro \
|
||||
-v /etc/kubernetes:/etc/kubernetes:ro \
|
||||
-v /etc/machine-id:/etc/machine-id:ro \
|
||||
-v /usr/lib/os-release:/etc/os-release:ro \
|
||||
@ -85,16 +88,15 @@ systemd:
|
||||
--authentication-token-webhook \
|
||||
--authorization-mode=Webhook \
|
||||
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
|
||||
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
|
||||
--client-ca-file=/etc/kubernetes/ca.crt \
|
||||
--cluster_dns=${cluster_dns_service_ip} \
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--healthz-port=0 \
|
||||
--kubeconfig=/var/lib/kubelet/kubeconfig \
|
||||
--network-plugin=cni \
|
||||
--node-labels=node.kubernetes.io/controller="true" \
|
||||
--pod-manifest-path=/etc/kubernetes/manifests \
|
||||
--provider-id=aws:///$${COREOS_EC2_AVAILABILITY_ZONE}/$${COREOS_EC2_INSTANCE_ID} \
|
||||
--read-only-port=0 \
|
||||
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
|
||||
--rotate-certificates \
|
||||
@ -117,7 +119,7 @@ systemd:
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
WorkingDirectory=/opt/bootstrap
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
ExecStart=/usr/bin/docker run \
|
||||
-v /etc/kubernetes/pki:/etc/kubernetes/pki:ro \
|
||||
-v /opt/bootstrap/assets:/assets:ro \
|
||||
|
@ -67,7 +67,6 @@ data "template_file" "controller-configs" {
|
||||
etcd_domain = "${var.cluster_name}-etcd${count.index}.${var.dns_zone}"
|
||||
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
|
||||
etcd_initial_cluster = join(",", data.template_file.etcds.*.rendered)
|
||||
cgroup_driver = local.channel == "edge" ? "systemd" : "cgroupfs"
|
||||
kubeconfig = indent(10, module.bootstrap.kubeconfig-kubelet)
|
||||
ssh_authorized_key = var.ssh_authorized_key
|
||||
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
|
||||
|
@ -201,8 +201,8 @@ resource "aws_security_group_rule" "controller-scheduler-metrics" {
|
||||
|
||||
type = "ingress"
|
||||
protocol = "tcp"
|
||||
from_port = 10251
|
||||
to_port = 10251
|
||||
from_port = 10259
|
||||
to_port = 10259
|
||||
source_security_group_id = aws_security_group.worker.id
|
||||
}
|
||||
|
||||
@ -212,8 +212,8 @@ resource "aws_security_group_rule" "controller-manager-metrics" {
|
||||
|
||||
type = "ingress"
|
||||
protocol = "tcp"
|
||||
from_port = 10252
|
||||
to_port = 10252
|
||||
from_port = 10257
|
||||
to_port = 10257
|
||||
source_security_group_id = aws_security_group.worker.id
|
||||
}
|
||||
|
||||
|
@ -43,25 +43,25 @@ variable "worker_type" {
|
||||
|
||||
variable "os_image" {
|
||||
type = string
|
||||
description = "AMI channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge)"
|
||||
description = "AMI channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha)"
|
||||
default = "flatcar-stable"
|
||||
|
||||
validation {
|
||||
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha", "flatcar-edge"], var.os_image)
|
||||
error_message = "The os_image must be flatcar-stable, flatcar-beta, flatcar-alpha, or flatcar-edge."
|
||||
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha"], var.os_image)
|
||||
error_message = "The os_image must be flatcar-stable, flatcar-beta, or flatcar-alpha."
|
||||
}
|
||||
}
|
||||
|
||||
variable "disk_size" {
|
||||
type = number
|
||||
description = "Size of the EBS volume in GB"
|
||||
default = 40
|
||||
default = 30
|
||||
}
|
||||
|
||||
variable "disk_type" {
|
||||
type = string
|
||||
description = "Type of the EBS volume (e.g. standard, gp2, io1)"
|
||||
default = "gp2"
|
||||
description = "Type of the EBS volume (e.g. standard, gp2, gp3, io1)"
|
||||
default = "gp3"
|
||||
}
|
||||
|
||||
variable "disk_iops" {
|
||||
@ -160,3 +160,8 @@ variable "cluster_domain_suffix" {
|
||||
default = "cluster.local"
|
||||
}
|
||||
|
||||
variable "daemonset_tolerations" {
|
||||
type = list(string)
|
||||
description = "List of additional taint keys kube-system DaemonSets should tolerate (e.g. ['custom-role', 'gpu-role'])"
|
||||
default = []
|
||||
}
|
||||
|
@ -1,7 +1,7 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.13.0"
|
||||
required_version = ">= 0.13.0, < 2.0.0"
|
||||
required_providers {
|
||||
aws = ">= 2.23, <= 4.0"
|
||||
template = "~> 2.1"
|
||||
@ -9,7 +9,7 @@ terraform {
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6"
|
||||
version = "~> 0.9"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -25,11 +25,13 @@ systemd:
|
||||
Description=Kubelet
|
||||
Requires=docker.service
|
||||
After=docker.service
|
||||
Requires=coreos-metadata.service
|
||||
After=coreos-metadata.service
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
EnvironmentFile=/run/metadata/coreos
|
||||
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
@ -43,6 +45,7 @@ systemd:
|
||||
--privileged \
|
||||
--pid host \
|
||||
--network host \
|
||||
-v /etc/cni/net.d:/etc/cni/net.d:ro \
|
||||
-v /etc/kubernetes:/etc/kubernetes:ro \
|
||||
-v /etc/machine-id:/etc/machine-id:ro \
|
||||
-v /usr/lib/os-release:/etc/os-release:ro \
|
||||
@ -60,11 +63,9 @@ systemd:
|
||||
--authentication-token-webhook \
|
||||
--authorization-mode=Webhook \
|
||||
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
|
||||
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
|
||||
--client-ca-file=/etc/kubernetes/ca.crt \
|
||||
--cluster_dns=${cluster_dns_service_ip} \
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--healthz-port=0 \
|
||||
--kubeconfig=/var/lib/kubelet/kubeconfig \
|
||||
--network-plugin=cni \
|
||||
@ -72,7 +73,11 @@ systemd:
|
||||
%{~ for label in split(",", node_labels) ~}
|
||||
--node-labels=${label} \
|
||||
%{~ endfor ~}
|
||||
%{~ for taint in split(",", node_taints) ~}
|
||||
--register-with-taints=${taint} \
|
||||
%{~ endfor ~}
|
||||
--pod-manifest-path=/etc/kubernetes/manifests \
|
||||
--provider-id=aws:///$${COREOS_EC2_AVAILABILITY_ZONE}/$${COREOS_EC2_INSTANCE_ID} \
|
||||
--read-only-port=0 \
|
||||
--rotate-certificates \
|
||||
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
|
||||
@ -89,7 +94,7 @@ systemd:
|
||||
[Unit]
|
||||
Description=Delete Kubernetes node on shutdown
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
ExecStart=/bin/true
|
||||
|
@ -36,25 +36,25 @@ variable "instance_type" {
|
||||
|
||||
variable "os_image" {
|
||||
type = string
|
||||
description = "AMI channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge)"
|
||||
description = "AMI channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha)"
|
||||
default = "flatcar-stable"
|
||||
|
||||
validation {
|
||||
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha", "flatcar-edge"], var.os_image)
|
||||
error_message = "The os_image must be flatcar-stable, flatcar-beta, flatcar-alpha, or flatcar-edge."
|
||||
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha"], var.os_image)
|
||||
error_message = "The os_image must be flatcar-stable, flatcar-beta, or flatcar-alpha."
|
||||
}
|
||||
}
|
||||
|
||||
variable "disk_size" {
|
||||
type = number
|
||||
description = "Size of the EBS volume in GB"
|
||||
default = 40
|
||||
default = 30
|
||||
}
|
||||
|
||||
variable "disk_type" {
|
||||
type = string
|
||||
description = "Type of the EBS volume (e.g. standard, gp2, io1)"
|
||||
default = "gp2"
|
||||
description = "Type of the EBS volume (e.g. standard, gp2, gp3, io1)"
|
||||
default = "gp3"
|
||||
}
|
||||
|
||||
variable "disk_iops" {
|
||||
@ -113,3 +113,9 @@ variable "node_labels" {
|
||||
description = "List of initial node labels"
|
||||
default = []
|
||||
}
|
||||
|
||||
variable "node_taints" {
|
||||
type = list(string)
|
||||
description = "List of initial node taints"
|
||||
default = []
|
||||
}
|
||||
|
@ -1,14 +1,14 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.13.0"
|
||||
required_version = ">= 0.13.0, < 2.0.0"
|
||||
required_providers {
|
||||
aws = ">= 2.23, <= 4.0"
|
||||
template = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6"
|
||||
version = "~> 0.9"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -85,8 +85,8 @@ data "template_file" "worker-config" {
|
||||
ssh_authorized_key = var.ssh_authorized_key
|
||||
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
|
||||
cluster_domain_suffix = var.cluster_domain_suffix
|
||||
cgroup_driver = local.channel == "edge" ? "systemd" : "cgroupfs"
|
||||
node_labels = join(",", var.node_labels)
|
||||
node_taints = join(",", var.node_taints)
|
||||
}
|
||||
}
|
||||
|
||||
|
@@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster

## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>

* Kubernetes v1.20.0 (upstream)
* Kubernetes v1.22.1 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot priority](https://typhoon.psdn.io/fedora-coreos/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

@@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=4edd79dd0295e6ffa4c8ed04fd5914d5cb3f1b7c"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d7fd3f62661def56e231602a3b101ff5e9ea8447"

cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
@@ -18,6 +18,7 @@ module "bootstrap" {
cluster_domain_suffix = var.cluster_domain_suffix
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
daemonset_tolerations = var.daemonset_tolerations

# Fedora CoreOS
trusted_certs_dir = "/etc/pki/tls/certs"
@ -1,6 +1,6 @@
|
||||
---
|
||||
variant: fcos
|
||||
version: 1.1.0
|
||||
version: 1.4.0
|
||||
systemd:
|
||||
units:
|
||||
- name: etcd-member.service
|
||||
@ -12,7 +12,7 @@ systemd:
|
||||
Wants=network-online.target network.target
|
||||
After=network-online.target
|
||||
[Service]
|
||||
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.14
|
||||
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.5.0
|
||||
Type=exec
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/etcd
|
||||
ExecStartPre=-/usr/bin/podman rm etcd
|
||||
@ -51,8 +51,8 @@ systemd:
|
||||
Description=Kubelet (System Container)
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
@ -63,12 +63,12 @@ systemd:
|
||||
--privileged \
|
||||
--pid host \
|
||||
--network host \
|
||||
--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
|
||||
--volume /etc/kubernetes:/etc/kubernetes:ro,z \
|
||||
--volume /usr/lib/os-release:/etc/os-release:ro \
|
||||
--volume /lib/modules:/lib/modules:ro \
|
||||
--volume /run:/run \
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
|
||||
--volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup \
|
||||
--volume /var/lib/calico:/var/lib/calico:ro \
|
||||
--volume /var/lib/docker:/var/lib/docker \
|
||||
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
|
||||
@ -86,7 +86,6 @@ systemd:
|
||||
--client-ca-file=/etc/kubernetes/ca.crt \
|
||||
--cluster_dns=${cluster_dns_service_ip} \
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--healthz-port=0 \
|
||||
--kubeconfig=/var/lib/kubelet/kubeconfig \
|
||||
--network-plugin=cni \
|
||||
@ -118,7 +117,7 @@ systemd:
|
||||
--volume /opt/bootstrap/assets:/assets:ro,Z \
|
||||
--volume /opt/bootstrap/apply:/apply:ro,Z \
|
||||
--entrypoint=/apply \
|
||||
quay.io/poseidon/kubelet:v1.20.0
|
||||
quay.io/poseidon/kubelet:v1.22.1
|
||||
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
|
||||
ExecStartPost=-/usr/bin/podman stop bootstrap
|
||||
storage:
|
||||
|
@@ -112,16 +112,12 @@ resource "azurerm_lb_outbound_rule" "worker-outbound" {

# Address pool of controllers
resource "azurerm_lb_backend_address_pool" "controller" {
resource_group_name = azurerm_resource_group.cluster.name

name = "controller"
loadbalancer_id = azurerm_lb.cluster.id
}

# Address pool of workers
resource "azurerm_lb_backend_address_pool" "worker" {
resource_group_name = azurerm_resource_group.cluster.name

name = "worker"
loadbalancer_id = azurerm_lb.cluster.id
}

@@ -95,7 +95,7 @@ resource "azurerm_network_security_rule" "controller-kube-metrics" {
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10251-10252"
destination_port_range = "10257-10259"
source_address_prefix = azurerm_subnet.worker.address_prefix
destination_address_prefix = azurerm_subnet.controller.address_prefix
}

@ -54,7 +54,7 @@ variable "os_image" {
|
||||
variable "disk_size" {
|
||||
type = number
|
||||
description = "Size of the disk in GB"
|
||||
default = 40
|
||||
default = 30
|
||||
}
|
||||
|
||||
variable "worker_priority" {
|
||||
@ -65,13 +65,13 @@ variable "worker_priority" {
|
||||
|
||||
variable "controller_snippets" {
|
||||
type = list(string)
|
||||
description = "Controller Fedora CoreOS Config snippets"
|
||||
description = "Controller Butane snippets"
|
||||
default = []
|
||||
}
|
||||
|
||||
variable "worker_snippets" {
|
||||
type = list(string)
|
||||
description = "Worker Fedora CoreOS Config snippets"
|
||||
description = "Worker Butane snippets"
|
||||
default = []
|
||||
}
|
||||
|
||||
@ -135,3 +135,8 @@ variable "cluster_domain_suffix" {
|
||||
default = "cluster.local"
|
||||
}
|
||||
|
||||
variable "daemonset_tolerations" {
|
||||
type = list(string)
|
||||
description = "List of additional taint keys kube-system DaemonSets should tolerate (e.g. ['custom-role', 'gpu-role'])"
|
||||
default = []
|
||||
}
|
||||
|
@ -1,7 +1,7 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.13.0"
|
||||
required_version = ">= 0.13.0, < 2.0.0"
|
||||
required_providers {
|
||||
azurerm = "~> 2.8"
|
||||
template = "~> 2.1"
|
||||
@ -9,7 +9,7 @@ terraform {
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6"
|
||||
version = "~> 0.9"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -1,6 +1,6 @@
|
||||
---
|
||||
variant: fcos
|
||||
version: 1.1.0
|
||||
version: 1.4.0
|
||||
systemd:
|
||||
units:
|
||||
- name: docker.service
|
||||
@ -24,8 +24,8 @@ systemd:
|
||||
Description=Kubelet (System Container)
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
@ -36,12 +36,12 @@ systemd:
|
||||
--privileged \
|
||||
--pid host \
|
||||
--network host \
|
||||
--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
|
||||
--volume /etc/kubernetes:/etc/kubernetes:ro,z \
|
||||
--volume /usr/lib/os-release:/etc/os-release:ro \
|
||||
--volume /lib/modules:/lib/modules:ro \
|
||||
--volume /run:/run \
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
|
||||
--volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup \
|
||||
--volume /var/lib/calico:/var/lib/calico:ro \
|
||||
--volume /var/lib/docker:/var/lib/docker \
|
||||
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
|
||||
@ -59,7 +59,6 @@ systemd:
|
||||
--client-ca-file=/etc/kubernetes/ca.crt \
|
||||
--cluster_dns=${cluster_dns_service_ip} \
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--healthz-port=0 \
|
||||
--kubeconfig=/var/lib/kubelet/kubeconfig \
|
||||
--network-plugin=cni \
|
||||
@ -67,6 +66,9 @@ systemd:
|
||||
%{~ for label in split(",", node_labels) ~}
|
||||
--node-labels=${label} \
|
||||
%{~ endfor ~}
|
||||
%{~ for taint in split(",", node_taints) ~}
|
||||
--register-with-taints=${taint} \
|
||||
%{~ endfor ~}
|
||||
--pod-manifest-path=/etc/kubernetes/manifests \
|
||||
--read-only-port=0 \
|
||||
--rotate-certificates \
|
||||
@ -83,7 +85,7 @@ systemd:
|
||||
[Unit]
|
||||
Description=Delete Kubernetes node on shutdown
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
ExecStart=/bin/true
|
||||
|
@ -57,7 +57,7 @@ variable "priority" {
|
||||
|
||||
variable "snippets" {
|
||||
type = list(string)
|
||||
description = "Fedora CoreOS Config snippets"
|
||||
description = "Butane snippets"
|
||||
default = []
|
||||
}
|
||||
|
||||
@ -88,6 +88,12 @@ variable "node_labels" {
|
||||
default = []
|
||||
}
|
||||
|
||||
variable "node_taints" {
|
||||
type = list(string)
|
||||
description = "List of initial node taints"
|
||||
default = []
|
||||
}
|
||||
|
||||
# unofficial, undocumented, unsupported
|
||||
|
||||
variable "cluster_domain_suffix" {
|
||||
|
@ -1,14 +1,14 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.13.0"
|
||||
required_version = ">= 0.13.0, < 2.0.0"
|
||||
required_providers {
|
||||
azurerm = "~> 2.8"
|
||||
template = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6"
|
||||
version = "~> 0.9"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -87,6 +87,7 @@ data "template_file" "worker-config" {
|
||||
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
|
||||
cluster_domain_suffix = var.cluster_domain_suffix
|
||||
node_labels = join(",", var.node_labels)
|
||||
node_taints = join(",", var.node_taints)
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.20.0 (upstream)
|
||||
* Kubernetes v1.22.1 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [low-priority](https://typhoon.psdn.io/flatcar-linux/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=4edd79dd0295e6ffa4c8ed04fd5914d5cb3f1b7c"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d7fd3f62661def56e231602a3b101ff5e9ea8447"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
|
||||
@ -18,5 +18,6 @@ module "bootstrap" {
|
||||
cluster_domain_suffix = var.cluster_domain_suffix
|
||||
enable_reporting = var.enable_reporting
|
||||
enable_aggregation = var.enable_aggregation
|
||||
daemonset_tolerations = var.daemonset_tolerations
|
||||
}
|
||||
|
||||
|
@ -10,7 +10,7 @@ systemd:
|
||||
Requires=docker.service
|
||||
After=docker.service
|
||||
[Service]
|
||||
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.14
|
||||
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.5.0
|
||||
ExecStartPre=/usr/bin/docker run -d \
|
||||
--name etcd \
|
||||
--network host \
|
||||
@ -55,9 +55,8 @@ systemd:
|
||||
After=docker.service
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
@ -68,6 +67,7 @@ systemd:
|
||||
--privileged \
|
||||
--pid host \
|
||||
--network host \
|
||||
-v /etc/cni/net.d:/etc/cni/net.d:ro \
|
||||
-v /etc/kubernetes:/etc/kubernetes:ro \
|
||||
-v /etc/machine-id:/etc/machine-id:ro \
|
||||
-v /usr/lib/os-release:/etc/os-release:ro \
|
||||
@ -85,11 +85,9 @@ systemd:
|
||||
--authentication-token-webhook \
|
||||
--authorization-mode=Webhook \
|
||||
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
|
||||
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
|
||||
--client-ca-file=/etc/kubernetes/ca.crt \
|
||||
--cluster_dns=${cluster_dns_service_ip} \
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--healthz-port=0 \
|
||||
--kubeconfig=/var/lib/kubelet/kubeconfig \
|
||||
--network-plugin=cni \
|
||||
@ -117,7 +115,7 @@ systemd:
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
WorkingDirectory=/opt/bootstrap
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
ExecStart=/usr/bin/docker run \
|
||||
-v /etc/kubernetes/pki:/etc/kubernetes/pki:ro \
|
||||
-v /opt/bootstrap/assets:/assets:ro \
|
||||
|
@ -150,7 +150,6 @@ data "template_file" "controller-configs" {
|
||||
etcd_domain = "${var.cluster_name}-etcd${count.index}.${var.dns_zone}"
|
||||
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
|
||||
etcd_initial_cluster = join(",", data.template_file.etcds.*.rendered)
|
||||
cgroup_driver = local.channel == "edge" ? "systemd" : "cgroupfs"
|
||||
kubeconfig = indent(10, module.bootstrap.kubeconfig-kubelet)
|
||||
ssh_authorized_key = var.ssh_authorized_key
|
||||
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
|
||||
|
@ -112,16 +112,12 @@ resource "azurerm_lb_outbound_rule" "worker-outbound" {
|
||||
|
||||
# Address pool of controllers
|
||||
resource "azurerm_lb_backend_address_pool" "controller" {
|
||||
resource_group_name = azurerm_resource_group.cluster.name
|
||||
|
||||
name = "controller"
|
||||
loadbalancer_id = azurerm_lb.cluster.id
|
||||
}
|
||||
|
||||
# Address pool of workers
|
||||
resource "azurerm_lb_backend_address_pool" "worker" {
|
||||
resource_group_name = azurerm_resource_group.cluster.name
|
||||
|
||||
name = "worker"
|
||||
loadbalancer_id = azurerm_lb.cluster.id
|
||||
}
|
||||
|
@ -95,7 +95,7 @@ resource "azurerm_network_security_rule" "controller-kube-metrics" {
|
||||
direction = "Inbound"
|
||||
protocol = "Tcp"
|
||||
source_port_range = "*"
|
||||
destination_port_range = "10251-10252"
|
||||
destination_port_range = "10257-10259"
|
||||
source_address_prefix = azurerm_subnet.worker.address_prefix
|
||||
destination_address_prefix = azurerm_subnet.controller.address_prefix
|
||||
}
|
||||
|
@ -48,19 +48,19 @@ variable "worker_type" {
|
||||
|
||||
variable "os_image" {
|
||||
type = string
|
||||
description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge)"
|
||||
description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha)"
|
||||
default = "flatcar-stable"
|
||||
|
||||
validation {
|
||||
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha", "flatcar-edge"], var.os_image)
|
||||
error_message = "The os_image must be flatcar-stable, flatcar-beta, flatcar-alpha, or flatcar-edge."
|
||||
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha"], var.os_image)
|
||||
error_message = "The os_image must be flatcar-stable, flatcar-beta, or flatcar-alpha."
|
||||
}
|
||||
}
|
||||
|
||||
variable "disk_size" {
|
||||
type = number
|
||||
description = "Size of the disk in GB"
|
||||
default = 40
|
||||
default = 30
|
||||
}
|
||||
|
||||
variable "worker_priority" {
|
||||
@ -141,3 +141,8 @@ variable "cluster_domain_suffix" {
|
||||
default = "cluster.local"
|
||||
}
|
||||
|
||||
variable "daemonset_tolerations" {
|
||||
type = list(string)
|
||||
description = "List of additional taint keys kube-system DaemonSets should tolerate (e.g. ['custom-role', 'gpu-role'])"
|
||||
default = []
|
||||
}
|
||||
|
@ -1,7 +1,7 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.13.0"
|
||||
required_version = ">= 0.13.0, < 2.0.0"
|
||||
required_providers {
|
||||
azurerm = "~> 2.8"
|
||||
template = "~> 2.1"
|
||||
@ -9,7 +9,7 @@ terraform {
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6"
|
||||
version = "~> 0.9"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -27,9 +27,8 @@ systemd:
|
||||
After=docker.service
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
@ -43,6 +42,7 @@ systemd:
|
||||
--privileged \
|
||||
--pid host \
|
||||
--network host \
|
||||
-v /etc/cni/net.d:/etc/cni/net.d:ro \
|
||||
-v /etc/kubernetes:/etc/kubernetes:ro \
|
||||
-v /etc/machine-id:/etc/machine-id:ro \
|
||||
-v /usr/lib/os-release:/etc/os-release:ro \
|
||||
@ -60,11 +60,9 @@ systemd:
|
||||
--authentication-token-webhook \
|
||||
--authorization-mode=Webhook \
|
||||
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
|
||||
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
|
||||
--client-ca-file=/etc/kubernetes/ca.crt \
|
||||
--cluster_dns=${cluster_dns_service_ip} \
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--healthz-port=0 \
|
||||
--kubeconfig=/var/lib/kubelet/kubeconfig \
|
||||
--network-plugin=cni \
|
||||
@ -72,6 +70,9 @@ systemd:
|
||||
%{~ for label in split(",", node_labels) ~}
|
||||
--node-labels=${label} \
|
||||
%{~ endfor ~}
|
||||
%{~ for taint in split(",", node_taints) ~}
|
||||
--register-with-taints=${taint} \
|
||||
%{~ endfor ~}
|
||||
--pod-manifest-path=/etc/kubernetes/manifests \
|
||||
--read-only-port=0 \
|
||||
--rotate-certificates \
|
||||
@ -89,7 +90,7 @@ systemd:
|
||||
[Unit]
|
||||
Description=Delete Kubernetes node on shutdown
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
ExecStart=/bin/true
|
||||
|
@ -46,12 +46,12 @@ variable "vm_type" {
|
||||
|
||||
variable "os_image" {
|
||||
type = string
|
||||
description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge)"
|
||||
description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha)"
|
||||
default = "flatcar-stable"
|
||||
|
||||
validation {
|
||||
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha", "flatcar-edge"], var.os_image)
|
||||
error_message = "The os_image must be flatcar-stable, flatcar-beta, flatcar-alpha, or flatcar-edge."
|
||||
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha"], var.os_image)
|
||||
error_message = "The os_image must be flatcar-stable, flatcar-beta, or flatcar-alpha."
|
||||
}
|
||||
}
|
||||
|
||||
@ -94,6 +94,12 @@ variable "node_labels" {
|
||||
default = []
|
||||
}
|
||||
|
||||
variable "node_taints" {
|
||||
type = list(string)
|
||||
description = "List of initial node taints"
|
||||
default = []
|
||||
}
|
||||
|
||||
# unofficial, undocumented, unsupported
|
||||
|
||||
variable "cluster_domain_suffix" {
|
||||
|
@ -1,14 +1,14 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.13.0"
|
||||
required_version = ">= 0.13.0, < 2.0.0"
|
||||
required_providers {
|
||||
azurerm = "~> 2.8"
|
||||
template = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6"
|
||||
version = "~> 0.9"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -104,8 +104,8 @@ data "template_file" "worker-config" {
|
||||
ssh_authorized_key = var.ssh_authorized_key
|
||||
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
|
||||
cluster_domain_suffix = var.cluster_domain_suffix
|
||||
cgroup_driver = local.channel == "edge" ? "systemd" : "cgroupfs"
|
||||
node_labels = join(",", var.node_labels)
|
||||
node_taints = join(",", var.node_taints)
|
||||
}
|
||||
}
|
||||
|
||||
|
@@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster

## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>

* Kubernetes v1.20.0 (upstream)
* Kubernetes v1.22.1 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization

@@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=4edd79dd0295e6ffa4c8ed04fd5914d5cb3f1b7c"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d7fd3f62661def56e231602a3b101ff5e9ea8447"

cluster_name = var.cluster_name
api_servers = [var.k8s_domain_name]

@ -1,6 +1,6 @@
|
||||
---
|
||||
variant: fcos
|
||||
version: 1.1.0
|
||||
version: 1.4.0
|
||||
systemd:
|
||||
units:
|
||||
- name: etcd-member.service
|
||||
@ -12,7 +12,7 @@ systemd:
|
||||
Wants=network-online.target network.target
|
||||
After=network-online.target
|
||||
[Service]
|
||||
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.14
|
||||
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.5.0
|
||||
Type=exec
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/etcd
|
||||
ExecStartPre=-/usr/bin/podman rm etcd
|
||||
@ -50,8 +50,8 @@ systemd:
|
||||
Description=Kubelet (System Container)
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
@ -62,12 +62,12 @@ systemd:
|
||||
--privileged \
|
||||
--pid host \
|
||||
--network host \
|
||||
--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
|
||||
--volume /etc/kubernetes:/etc/kubernetes:ro,z \
|
||||
--volume /usr/lib/os-release:/etc/os-release:ro \
|
||||
--volume /lib/modules:/lib/modules:ro \
|
||||
--volume /run:/run \
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
|
||||
--volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup \
|
||||
--volume /var/lib/calico:/var/lib/calico:ro \
|
||||
--volume /var/lib/docker:/var/lib/docker \
|
||||
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
|
||||
@ -85,7 +85,6 @@ systemd:
|
||||
--client-ca-file=/etc/kubernetes/ca.crt \
|
||||
--cluster_dns=${cluster_dns_service_ip} \
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--healthz-port=0 \
|
||||
--hostname-override=${domain_name} \
|
||||
--kubeconfig=/var/lib/kubelet/kubeconfig \
|
||||
@ -120,6 +119,7 @@ systemd:
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
WorkingDirectory=/opt/bootstrap
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
ExecStartPre=-/usr/bin/podman rm bootstrap
|
||||
ExecStart=/usr/bin/podman run --name bootstrap \
|
||||
--network host \
|
||||
@ -127,7 +127,7 @@ systemd:
|
||||
--volume /opt/bootstrap/assets:/assets:ro,Z \
|
||||
--volume /opt/bootstrap/apply:/apply:ro,Z \
|
||||
--entrypoint=/apply \
|
||||
quay.io/poseidon/kubelet:v1.20.0
|
||||
$${KUBELET_IMAGE}
|
||||
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
|
||||
ExecStartPost=-/usr/bin/podman stop bootstrap
|
||||
storage:
|
||||
|
@ -1,6 +1,6 @@
|
||||
---
|
||||
variant: fcos
|
||||
version: 1.1.0
|
||||
version: 1.4.0
|
||||
systemd:
|
||||
units:
|
||||
- name: docker.service
|
||||
@ -23,8 +23,8 @@ systemd:
|
||||
Description=Kubelet (System Container)
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
@ -35,12 +35,12 @@ systemd:
|
||||
--privileged \
|
||||
--pid host \
|
||||
--network host \
|
||||
--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
|
||||
--volume /etc/kubernetes:/etc/kubernetes:ro,z \
|
||||
--volume /usr/lib/os-release:/etc/os-release:ro \
|
||||
--volume /lib/modules:/lib/modules:ro \
|
||||
--volume /run:/run \
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
|
||||
--volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup \
|
||||
--volume /var/lib/calico:/var/lib/calico:ro \
|
||||
--volume /var/lib/docker:/var/lib/docker \
|
||||
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
|
||||
@ -58,7 +58,6 @@ systemd:
|
||||
--client-ca-file=/etc/kubernetes/ca.crt \
|
||||
--cluster_dns=${cluster_dns_service_ip} \
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--healthz-port=0 \
|
||||
--hostname-override=${domain_name} \
|
||||
--kubeconfig=/var/lib/kubelet/kubeconfig \
|
||||
|
@ -39,6 +39,7 @@ resource "null_resource" "copy-controller-secrets" {
|
||||
provisioner "remote-exec" {
|
||||
inline = [
|
||||
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
|
||||
"sudo touch /etc/kubernetes",
|
||||
"sudo /opt/bootstrap/layout",
|
||||
]
|
||||
}
|
||||
@ -70,6 +71,7 @@ resource "null_resource" "copy-worker-secrets" {
|
||||
provisioner "remote-exec" {
|
||||
inline = [
|
||||
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
|
||||
"sudo touch /etc/kubernetes",
|
||||
]
|
||||
}
|
||||
}
|
||||
|
@ -57,7 +57,7 @@ EOD
|
||||
|
||||
variable "snippets" {
|
||||
type = map(list(string))
|
||||
description = "Map from machine names to lists of Fedora CoreOS Config snippets"
|
||||
description = "Map from machine names to lists of Butane snippets"
|
||||
default = {}
|
||||
}
|
||||
|
||||
|
@ -1,14 +1,14 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.13.0"
|
||||
required_version = ">= 0.13.0, < 2.0.0"
|
||||
required_providers {
|
||||
template = "~> 2.1"
|
||||
null = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6"
|
||||
version = "~> 0.9"
|
||||
}
|
||||
|
||||
matchbox = {
|
||||
|
@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.20.0 (upstream)
|
||||
* Kubernetes v1.22.1 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=4edd79dd0295e6ffa4c8ed04fd5914d5cb3f1b7c"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d7fd3f62661def56e231602a3b101ff5e9ea8447"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [var.k8s_domain_name]
|
||||
|
@ -10,7 +10,7 @@ systemd:
|
||||
Requires=docker.service
|
||||
After=docker.service
|
||||
[Service]
|
||||
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.14
|
||||
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.5.0
|
||||
ExecStartPre=/usr/bin/docker run -d \
|
||||
--name etcd \
|
||||
--network host \
|
||||
@ -63,9 +63,8 @@ systemd:
|
||||
After=docker.service
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
@ -76,6 +75,7 @@ systemd:
|
||||
--privileged \
|
||||
--pid host \
|
||||
--network host \
|
||||
-v /etc/cni/net.d:/etc/cni/net.d:ro \
|
||||
-v /etc/kubernetes:/etc/kubernetes:ro \
|
||||
-v /etc/machine-id:/etc/machine-id:ro \
|
||||
-v /usr/lib/os-release:/etc/os-release:ro \
|
||||
@ -93,11 +93,9 @@ systemd:
|
||||
--authentication-token-webhook \
|
||||
--authorization-mode=Webhook \
|
||||
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
|
||||
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
|
||||
--client-ca-file=/etc/kubernetes/ca.crt \
|
||||
--cluster_dns=${cluster_dns_service_ip} \
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--healthz-port=0 \
|
||||
--hostname-override=${domain_name} \
|
||||
--kubeconfig=/var/lib/kubelet/kubeconfig \
|
||||
@ -126,7 +124,7 @@ systemd:
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
WorkingDirectory=/opt/bootstrap
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
ExecStart=/usr/bin/docker run \
|
||||
-v /etc/kubernetes/pki:/etc/kubernetes/pki:ro \
|
||||
-v /opt/bootstrap/assets:/assets:ro \
|
||||
|
@ -35,9 +35,8 @@ systemd:
|
||||
After=docker.service
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
@ -51,6 +50,7 @@ systemd:
|
||||
--privileged \
|
||||
--pid host \
|
||||
--network host \
|
||||
-v /etc/cni/net.d:/etc/cni/net.d:ro \
|
||||
-v /etc/kubernetes:/etc/kubernetes:ro \
|
||||
-v /etc/machine-id:/etc/machine-id:ro \
|
||||
-v /usr/lib/os-release:/etc/os-release:ro \
|
||||
@ -68,11 +68,9 @@ systemd:
|
||||
--authentication-token-webhook \
|
||||
--authorization-mode=Webhook \
|
||||
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
|
||||
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
|
||||
--client-ca-file=/etc/kubernetes/ca.crt \
|
||||
--cluster_dns=${cluster_dns_service_ip} \
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--healthz-port=0 \
|
||||
--hostname-override=${domain_name} \
|
||||
--kubeconfig=/var/lib/kubelet/kubeconfig \
|
||||
|
@ -106,7 +106,6 @@ data "template_file" "controller-configs" {
|
||||
domain_name = var.controllers.*.domain[count.index]
|
||||
etcd_name = var.controllers.*.name[count.index]
|
||||
etcd_initial_cluster = join(",", formatlist("%s=https://%s:2380", var.controllers.*.name, var.controllers.*.domain))
|
||||
cgroup_driver = var.os_channel == "flatcar-edge" ? "systemd" : "cgroupfs"
|
||||
cluster_dns_service_ip = module.bootstrap.cluster_dns_service_ip
|
||||
cluster_domain_suffix = var.cluster_domain_suffix
|
||||
ssh_authorized_key = var.ssh_authorized_key
|
||||
@ -134,7 +133,6 @@ data "template_file" "worker-configs" {
|
||||
|
||||
vars = {
|
||||
domain_name = var.workers.*.domain[count.index]
|
||||
cgroup_driver = var.os_channel == "flatcar-edge" ? "systemd" : "cgroupfs"
|
||||
cluster_dns_service_ip = module.bootstrap.cluster_dns_service_ip
|
||||
cluster_domain_suffix = var.cluster_domain_suffix
|
||||
ssh_authorized_key = var.ssh_authorized_key
|
||||
|
@ -12,11 +12,11 @@ variable "matchbox_http_endpoint" {
|
||||
|
||||
variable "os_channel" {
|
||||
type = string
|
||||
description = "Channel for a Flatcar Linux (flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge)"
|
||||
description = "Channel for a Flatcar Linux (flatcar-stable, flatcar-beta, flatcar-alpha)"
|
||||
|
||||
validation {
|
||||
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha", "flatcar-edge"], var.os_channel)
|
||||
error_message = "The os_channel must be flatcar-stable, flatcar-beta, flatcar-alpha, or flatcar-edge."
|
||||
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha"], var.os_channel)
|
||||
error_message = "The os_channel must be flatcar-stable, flatcar-beta, or flatcar-alpha."
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -1,14 +1,14 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.13.0"
|
||||
required_version = ">= 0.13.0, < 2.0.0"
|
||||
required_providers {
|
||||
template = "~> 2.1"
|
||||
null = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6"
|
||||
version = "~> 0.9"
|
||||
}
|
||||
|
||||
matchbox = {
|
||||
|
@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.20.0 (upstream)
|
||||
* Kubernetes v1.22.1 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
|
||||
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=4edd79dd0295e6ffa4c8ed04fd5914d5cb3f1b7c"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d7fd3f62661def56e231602a3b101ff5e9ea8447"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
|
||||
|
@ -1,6 +1,6 @@
|
||||
---
|
||||
variant: fcos
|
||||
version: 1.1.0
|
||||
version: 1.4.0
|
||||
systemd:
|
||||
units:
|
||||
- name: etcd-member.service
|
||||
@ -12,7 +12,7 @@ systemd:
|
||||
Wants=network-online.target network.target
|
||||
After=network-online.target
|
||||
[Service]
|
||||
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.14
|
||||
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.5.0
|
||||
Type=exec
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/etcd
|
||||
ExecStartPre=-/usr/bin/podman rm etcd
|
||||
@ -52,9 +52,9 @@ systemd:
|
||||
After=afterburn.service
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
EnvironmentFile=/run/metadata/afterburn
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
@ -65,12 +65,12 @@ systemd:
|
||||
--privileged \
|
||||
--pid host \
|
||||
--network host \
|
||||
--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
|
||||
--volume /etc/kubernetes:/etc/kubernetes:ro,z \
|
||||
--volume /usr/lib/os-release:/etc/os-release:ro \
|
||||
--volume /lib/modules:/lib/modules:ro \
|
||||
--volume /run:/run \
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
|
||||
--volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup \
|
||||
--volume /var/lib/calico:/var/lib/calico:ro \
|
||||
--volume /var/lib/docker:/var/lib/docker \
|
||||
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
|
||||
@ -88,7 +88,6 @@ systemd:
|
||||
--client-ca-file=/etc/kubernetes/ca.crt \
|
||||
--cluster_dns=${cluster_dns_service_ip} \
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--healthz-port=0 \
|
||||
--hostname-override=$${AFTERBURN_DIGITALOCEAN_IPV4_PRIVATE_0} \
|
||||
--kubeconfig=/var/lib/kubelet/kubeconfig \
|
||||
@ -130,7 +129,7 @@ systemd:
|
||||
--volume /opt/bootstrap/assets:/assets:ro,Z \
|
||||
--volume /opt/bootstrap/apply:/apply:ro,Z \
|
||||
--entrypoint=/apply \
|
||||
quay.io/poseidon/kubelet:v1.20.0
|
||||
quay.io/poseidon/kubelet:v1.22.1
|
||||
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
|
||||
ExecStartPost=-/usr/bin/podman stop bootstrap
|
||||
storage:
|
||||
|
@ -1,6 +1,6 @@
|
||||
---
|
||||
variant: fcos
|
||||
version: 1.1.0
|
||||
version: 1.4.0
|
||||
systemd:
|
||||
units:
|
||||
- name: docker.service
|
||||
@ -26,9 +26,9 @@ systemd:
|
||||
After=afterburn.service
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
EnvironmentFile=/run/metadata/afterburn
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
@ -39,12 +39,12 @@ systemd:
|
||||
--privileged \
|
||||
--pid host \
|
||||
--network host \
|
||||
--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
|
||||
--volume /etc/kubernetes:/etc/kubernetes:ro,z \
|
||||
--volume /usr/lib/os-release:/etc/os-release:ro \
|
||||
--volume /lib/modules:/lib/modules:ro \
|
||||
--volume /run:/run \
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
|
||||
--volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup \
|
||||
--volume /var/lib/calico:/var/lib/calico:ro \
|
||||
--volume /var/lib/docker:/var/lib/docker \
|
||||
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
|
||||
@ -62,7 +62,6 @@ systemd:
|
||||
--client-ca-file=/etc/kubernetes/ca.crt \
|
||||
--cluster_dns=${cluster_dns_service_ip} \
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--healthz-port=0 \
|
||||
--hostname-override=$${AFTERBURN_DIGITALOCEAN_IPV4_PRIVATE_0} \
|
||||
--kubeconfig=/var/lib/kubelet/kubeconfig \
|
||||
@ -93,7 +92,7 @@ systemd:
|
||||
[Unit]
|
||||
Description=Delete Kubernetes node on shutdown
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
ExecStart=/bin/true
|
||||
|
@ -116,7 +116,7 @@ resource "digitalocean_firewall" "controllers" {
|
||||
# kube-scheduler metrics, kube-controller-manager metrics
|
||||
inbound_rule {
|
||||
protocol = "tcp"
|
||||
port_range = "10251-10252"
|
||||
port_range = "10257-10259"
|
||||
source_tags = [digitalocean_tag.workers.name]
|
||||
}
|
||||
}
|
||||
|
@ -36,6 +36,7 @@ resource "null_resource" "copy-controller-secrets" {
|
||||
provisioner "remote-exec" {
|
||||
inline = [
|
||||
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
|
||||
"sudo touch /etc/kubernetes",
|
||||
"sudo /opt/bootstrap/layout",
|
||||
]
|
||||
}
|
||||
@ -60,6 +61,7 @@ resource "null_resource" "copy-worker-secrets" {
|
||||
provisioner "remote-exec" {
|
||||
inline = [
|
||||
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
|
||||
"sudo touch /etc/kubernetes",
|
||||
]
|
||||
}
|
||||
}
|
||||
@ -84,4 +86,3 @@ resource "null_resource" "bootstrap" {
|
||||
]
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -48,13 +48,13 @@ variable "os_image" {
|
||||
|
||||
variable "controller_snippets" {
|
||||
type = list(string)
|
||||
description = "Controller Fedora CoreOS Config snippets"
|
||||
description = "Controller Butane snippets"
|
||||
default = []
|
||||
}
|
||||
|
||||
variable "worker_snippets" {
|
||||
type = list(string)
|
||||
description = "Worker Fedora CoreOS Config snippets"
|
||||
description = "Worker Butane snippets"
|
||||
default = []
|
||||
}
|
||||
|
||||
|
@ -1,14 +1,14 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.13.0"
|
||||
required_version = ">= 0.13.0, < 2.0.0"
|
||||
required_providers {
|
||||
template = "~> 2.1"
|
||||
null = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6"
|
||||
version = "~> 0.9"
|
||||
}
|
||||
|
||||
digitalocean = {
|
||||
|
@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.20.0 (upstream)
|
||||
* Kubernetes v1.22.1 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=4edd79dd0295e6ffa4c8ed04fd5914d5cb3f1b7c"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d7fd3f62661def56e231602a3b101ff5e9ea8447"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
|
||||
|
@ -10,7 +10,7 @@ systemd:
|
||||
Requires=docker.service
|
||||
After=docker.service
|
||||
[Service]
|
||||
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.14
|
||||
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.5.0
|
||||
ExecStartPre=/usr/bin/docker run -d \
|
||||
--name etcd \
|
||||
--network host \
|
||||
@ -65,9 +65,9 @@ systemd:
|
||||
After=coreos-metadata.service
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
EnvironmentFile=/run/metadata/coreos
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
@ -78,6 +78,7 @@ systemd:
|
||||
--privileged \
|
||||
--pid host \
|
||||
--network host \
|
||||
-v /etc/cni/net.d:/etc/cni/net.d:ro \
|
||||
-v /etc/kubernetes:/etc/kubernetes:ro \
|
||||
-v /etc/machine-id:/etc/machine-id:ro \
|
||||
-v /usr/lib/os-release:/etc/os-release:ro \
|
||||
@ -98,7 +99,6 @@ systemd:
|
||||
--client-ca-file=/etc/kubernetes/ca.crt \
|
||||
--cluster_dns=${cluster_dns_service_ip} \
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--healthz-port=0 \
|
||||
--hostname-override=$${COREOS_DIGITALOCEAN_IPV4_PRIVATE_0} \
|
||||
--kubeconfig=/var/lib/kubelet/kubeconfig \
|
||||
@ -127,7 +127,7 @@ systemd:
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
WorkingDirectory=/opt/bootstrap
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
ExecStart=/usr/bin/docker run \
|
||||
-v /etc/kubernetes/pki:/etc/kubernetes/pki:ro \
|
||||
-v /opt/bootstrap/assets:/assets:ro \
|
||||
|
@ -37,9 +37,9 @@ systemd:
|
||||
After=coreos-metadata.service
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
EnvironmentFile=/run/metadata/coreos
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
@ -53,6 +53,7 @@ systemd:
|
||||
--privileged \
|
||||
--pid host \
|
||||
--network host \
|
||||
-v /etc/cni/net.d:/etc/cni/net.d:ro \
|
||||
-v /etc/kubernetes:/etc/kubernetes:ro \
|
||||
-v /etc/machine-id:/etc/machine-id:ro \
|
||||
-v /usr/lib/os-release:/etc/os-release:ro \
|
||||
@ -73,7 +74,6 @@ systemd:
|
||||
--client-ca-file=/etc/kubernetes/ca.crt \
|
||||
--cluster_dns=${cluster_dns_service_ip} \
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--healthz-port=0 \
|
||||
--hostname-override=$${COREOS_DIGITALOCEAN_IPV4_PRIVATE_0} \
|
||||
--kubeconfig=/var/lib/kubelet/kubeconfig \
|
||||
@ -96,7 +96,7 @@ systemd:
|
||||
[Unit]
|
||||
Description=Delete Kubernetes node on shutdown
|
||||
[Service]
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.20.0
|
||||
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
ExecStart=/bin/true
|
||||
|
@ -116,7 +116,7 @@ resource "digitalocean_firewall" "controllers" {
|
||||
# kube-scheduler metrics, kube-controller-manager metrics
|
||||
inbound_rule {
|
||||
protocol = "tcp"
|
||||
port_range = "10251-10252"
|
||||
port_range = "10257-10259"
|
||||
source_tags = [digitalocean_tag.workers.name]
|
||||
}
|
||||
}
|
||||
|
@ -1,14 +1,14 @@
|
||||
# Terraform version and plugin versions
|
||||
|
||||
terraform {
|
||||
required_version = "~> 0.13.0"
|
||||
required_version = ">= 0.13.0, < 2.0.0"
|
||||
required_providers {
|
||||
template = "~> 2.1"
|
||||
null = "~> 2.1"
|
||||
|
||||
ct = {
|
||||
source = "poseidon/ct"
|
||||
version = "~> 0.6"
|
||||
version = "~> 0.9"
|
||||
}
|
||||
|
||||
digitalocean = {
|
||||
|
@@ -6,7 +6,7 @@
Typhoon has experimental support for ARM64 with Fedora CoreOS on AWS. Full clusters can be created with ARM64 controller and worker nodes. Or worker pools of ARM64 nodes can be attached to an AMD64 cluster to create a hybrid/mixed architecture cluster.

!!! note
    Currently, CNI networking must be set to flannel.
    Currently, CNI networking must be set to flannel or Cilium.

## AMIs

@@ -21,7 +21,7 @@ Create a cluster with ARM64 controller and worker nodes. Container workloads mus

```tf
module "gravitas" {
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.19.4"
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.22.1"

  # AWS
  cluster_name = "gravitas"
@@ -29,11 +29,11 @@ module "gravitas" {
  dns_zone_id  = "Z3PAABBCFAKEC0"

  # configuration
  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
  ssh_authorized_key = "ssh-ed25519 AAAAB3Nz..."

  # optional
  arch         = "arm64"
  networking   = "flannel"
  networking   = "cilium"
  worker_count = 2
  worker_price = "0.0168"

@@ -47,9 +47,9 @@ Verify the cluster has only arm64 (`aarch64`) nodes.
```
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
ip-10-0-12-178 Ready <none> 101s v1.19.4 10.0.12.178 <none> Fedora CoreOS 32.20201104.dev.0 5.8.17-200.fc32.aarch64 docker://19.3.11
ip-10-0-18-93 Ready <none> 102s v1.19.4 10.0.18.93 <none> Fedora CoreOS 32.20201104.dev.0 5.8.17-200.fc32.aarch64 docker://19.3.11
ip-10-0-90-10 Ready <none> 104s v1.19.4 10.0.90.10 <none> Fedora CoreOS 32.20201104.dev.0 5.8.17-200.fc32.aarch64 docker://19.3.11
ip-10-0-12-178 Ready <none> 101s v1.22.1 10.0.12.178 <none> Fedora CoreOS 32.20201104.dev.0 5.8.17-200.fc32.aarch64 docker://19.3.11
ip-10-0-18-93 Ready <none> 102s v1.22.1 10.0.18.93 <none> Fedora CoreOS 32.20201104.dev.0 5.8.17-200.fc32.aarch64 docker://19.3.11
ip-10-0-90-10 Ready <none> 104s v1.22.1 10.0.90.10 <none> Fedora CoreOS 32.20201104.dev.0 5.8.17-200.fc32.aarch64 docker://19.3.11
```

## Hybrid
@@ -60,7 +60,7 @@ Create a hybrid/mixed arch cluster by defining an AWS cluster. Then define a [wo

```tf
module "gravitas" {
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.19.4"
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.22.1"

  # AWS
  cluster_name = "gravitas"
@@ -68,10 +68,10 @@ Create a hybrid/mixed arch cluster by defining an AWS cluster. Then define a [wo
  dns_zone_id  = "Z3PAABBCFAKEC0"

  # configuration
  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
  ssh_authorized_key = "ssh-ed25519 AAAAB3Nz..."

  # optional
  networking   = "flannel"
  networking   = "cilium"
  worker_count = 2
  worker_price = "0.021"

@@ -83,7 +83,7 @@ Create a hybrid/mixed arch cluster by defining an AWS cluster. Then define a [wo

```tf
module "gravitas-arm64" {
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes/workers?ref=v1.19.4"
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes/workers?ref=v1.22.1"

  # AWS
  vpc_id = module.gravitas.vpc_id
@@ -107,10 +107,10 @@ Verify amd64 (x86_64) and arm64 (aarch64) nodes are present.

```
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
ip-10-0-14-73 Ready <none> 116s v1.19.4 10.0.14.73 <none> Fedora CoreOS 32.20201018.3.0 5.8.15-201.fc32.x86_64 docker://19.3.11
ip-10-0-17-167 Ready <none> 104s v1.19.4 10.0.17.167 <none> Fedora CoreOS 32.20201018.3.0 5.8.15-201.fc32.x86_64 docker://19.3.11
ip-10-0-47-166 Ready <none> 110s v1.19.4 10.0.47.166 <none> Fedora CoreOS 32.20201104.dev.0 5.8.17-200.fc32.aarch64 docker://19.3.11
ip-10-0-7-237 Ready <none> 111s v1.19.4 10.0.7.237 <none> Fedora CoreOS 32.20201018.3.0 5.8.15-201.fc32.x86_64 docker://19.3.11
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
ip-10-0-1-81 Ready <none> 4m28s v1.22.1 10.0.1.81 <none> Fedora CoreOS 34.20210427.3.0 5.11.15-300.fc34.x86_64 docker://20.10.6
ip-10-0-17-86 Ready <none> 4m28s v1.22.1 10.0.17.86 <none> Fedora CoreOS 33.20210413.dev.0 5.10.19-200.fc33.aarch64 docker://19.3.13
ip-10-0-21-45 Ready <none> 4m28s v1.22.1 10.0.21.45 <none> Fedora CoreOS 34.20210427.3.0 5.11.15-300.fc34.x86_64 docker://20.10.6
ip-10-0-40-36 Ready <none> 4m22s v1.22.1 10.0.40.36 <none> Fedora CoreOS 34.20210427.3.0 5.11.15-300.fc34.x86_64 docker://20.10.6
```

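In a hybrid cluster, workloads that ship only amd64 (or only arm64) images should be pinned to matching nodes via the standard `kubernetes.io/arch` label. A minimal sketch; the pod name and image are placeholders, not taken from this change:

```yaml
# Sketch: pin an arm64-only workload to arm64 nodes using the standard arch label.
apiVersion: v1
kind: Pod
metadata:
  name: arm64-only-example    # hypothetical name
spec:
  nodeSelector:
    kubernetes.io/arch: arm64
  containers:
    - name: app
      image: registry.example.com/app:latest   # placeholder image
```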
@@ -12,9 +12,9 @@ Clusters are kept to a minimal Kubernetes control plane by offering components l

## Hosts

Typhoon uses the [Ignition](https://github.com/coreos/ignition) system of Fedora CoreOS and Flatcar Linux to immutably declare a system via first-boot disk provisioning. Fedora CoreOS uses a [Fedora CoreOS Config](https://docs.fedoraproject.org/en-US/fedora-coreos/fcct-config/) (FCC) and Flatcar Linux uses a [Container Linux Config](https://github.com/coreos/container-linux-config-transpiler/blob/master/doc/examples.md) (CLC). These define disk partitions, filesystems, systemd units, dropins, config files, mount units, raid arrays, and users.
Typhoon uses the [Ignition](https://github.com/coreos/ignition) system of Fedora CoreOS and Flatcar Linux to immutably declare a system via first-boot disk provisioning. Fedora CoreOS uses a [Butane Config](https://coreos.github.io/butane/specs/) and Flatcar Linux uses a [Container Linux Config](https://github.com/coreos/container-linux-config-transpiler/blob/master/doc/examples.md) (CLC). These define disk partitions, filesystems, systemd units, dropins, config files, mount units, raid arrays, and users.

Controller and worker instances form a minimal and secure Kubernetes cluster on each platform. Typhoon provides the **snippets** feature to accept Fedora CoreOS Configs or Container Linux Configs to validate and additively merge into instance declarations. This allows advanced host customization and experimentation.
Controller and worker instances form a minimal and secure Kubernetes cluster on each platform. Typhoon provides the **snippets** feature to accept Butane or Container Linux Configs to validate and additively merge into instance declarations. This allows advanced host customization and experimentation.

!!! note
    Snippets cannot be used to modify an already existing instance, the antithesis of immutable provisioning. Ignition fully declares a system on first boot only.
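Snippets are passed to cluster modules as lists of raw config strings. A rough sketch of the wiring (the variable names match the `controller_snippets` / `worker_snippets` variables updated elsewhere in this diff; the module name and file paths are hypothetical):

```tf
# Sketch: feeding Butane snippet files into a Typhoon cluster module.
module "nemo" {
  # source/platform arguments omitted for brevity (illustrative only)

  controller_snippets = [
    file("./snippets/custom-files.yaml"),
  ]
  worker_snippets = [
    file("./snippets/custom-files.yaml"),
    file("./snippets/kubelet-image-override.yaml"),
  ]
}
```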
@@ -30,14 +30,14 @@ Controller and worker instances form a minimal and secure Kubernetes cluster on
!!! note
    Fedora CoreOS snippets require `terraform-provider-ct` v0.5+

Define a Fedora CoreOS Config (FCC) ([docs](https://docs.fedoraproject.org/en-US/fedora-coreos/fcct-config/), [config](https://github.com/coreos/fcct/blob/master/docs/configuration-v1_0.md), [examples](https://github.com/coreos/fcct/blob/master/docs/examples.md)) in version control near your Terraform workspace directory (e.g. perhaps in a `snippets` subdirectory). You may organize snippets into multiple files, if desired.
Define a Butane Config ([docs](https://coreos.github.io/butane/specs/), [config](https://github.com/coreos/butane/blob/main/docs/config-fcos-v1_4.md)) in version control near your Terraform workspace directory (e.g. perhaps in a `snippets` subdirectory). You may organize snippets into multiple files, if desired.

For example, ensure an `/opt/hello` file is created with permissions 0644.

```yaml
# custom-files
variant: fcos
version: 1.1.0
version: 1.4.0
storage:
  files:
    - path: /opt/hello
@@ -185,7 +185,7 @@ To set an alternative etcd image or Kubelet image, use a snippet to set a system
```yaml
# kubelet-image-override.yaml
variant: fcos <- remove for Flatcar Linux
version: 1.1.0 <- remove for Flatcar Linux
version: 1.4.0 <- remove for Flatcar Linux
systemd:
  units:
    - name: kubelet.service
@@ -201,7 +201,7 @@ To set an alternative etcd image or Kubelet image, use a snippet to set a system
```yaml
# etcd-image-override.yaml
variant: fcos <- remove for Flatcar Linux
version: 1.1.0 <- remove for Flatcar Linux
version: 1.4.0 <- remove for Flatcar Linux
systemd:
  units:
    - name: etcd-member.service
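The hunks above are truncated at the unit name, so for orientation only, here is a sketch of what such an image-override snippet typically looks like end to end. The dropin name and image tag are placeholders, not taken from this diff:

```yaml
# Sketch: a complete Kubelet image override snippet (illustrative values).
variant: fcos        # remove for Flatcar Linux
version: 1.4.0       # remove for Flatcar Linux
systemd:
  units:
    - name: kubelet.service
      dropins:
        - name: 10-image-override.conf     # hypothetical dropin name
          contents: |
            [Service]
            Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.22.1
```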
docs/advanced/nodes.md (new file, 134 lines)
@@ -0,0 +1,134 @@
# Nodes

Typhoon clusters consist of controller node(s) and a (default) set of worker nodes.

## Overview

Typhoon nodes use the standard set of Kubernetes node labels.

```yaml
Labels: kubernetes.io/arch=amd64
        kubernetes.io/hostname=node-name
        kubernetes.io/os=linux
```

Controller node(s) are labeled to allow node selection (for rare components that run on controllers) and tainted to prevent ordinary workloads from running on controllers.

```yaml
Labels: node.kubernetes.io/controller=true
Taints: node-role.kubernetes.io/controller:NoSchedule
```

Worker nodes are labeled to allow node selection and are untainted. Workloads will schedule on worker nodes by default, barring any contraindications.

```yaml
Labels: node.kubernetes.io/node=
Taints: <none>
```

On auto-scaling cloud platforms, you may add [worker pools](/advanced/worker-pools/) with different groups of nodes with their own labels and taints. On platforms like bare-metal, with heterogeneous machines, you may manage node labels and taints per node.
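For the rare component that must run on controllers, the controller label and taint shown above are used together: select on the label and tolerate the taint. A minimal sketch; the DaemonSet name and image are placeholders:

```yaml
# Sketch: run a (hypothetical) agent only on controller nodes.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: controller-agent       # hypothetical name
spec:
  selector:
    matchLabels:
      name: controller-agent
  template:
    metadata:
      labels:
        name: controller-agent
    spec:
      nodeSelector:
        node.kubernetes.io/controller: "true"
      tolerations:
        - key: node-role.kubernetes.io/controller
          operator: Exists
          effect: NoSchedule
      containers:
        - name: agent
          image: registry.example.com/agent:latest   # placeholder image
```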
## Node Labels

Add custom initial worker node labels to default workers or worker pool nodes to allow workloads to select among nodes that differ.

=== "Cluster"

    ```tf
    module "yavin" {
      source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.22.1"

      # Google Cloud
      cluster_name  = "yavin"
      region        = "us-central1"
      dns_zone      = "example.com"
      dns_zone_name = "example-zone"

      # configuration
      ssh_authorized_key = local.ssh_key

      # optional
      worker_count       = 2
      worker_node_labels = ["pool=default"]
    }
    ```

=== "Worker Pool"

    ```tf
    module "yavin-pool" {
      source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes/workers?ref=v1.22.1"

      # Google Cloud
      cluster_name = "yavin"
      region       = "europe-west2"
      network      = module.yavin.network_name

      # configuration
      name               = "yavin-16x"
      kubeconfig         = module.yavin.kubeconfig
      ssh_authorized_key = local.ssh_key

      # optional
      worker_count = 1
      machine_type = "n1-standard-16"
      node_labels  = ["pool=big"]
    }
    ```

In the example above, the two default workers would be labeled `pool: default` and the additional worker would be labeled `pool: big`.

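A workload can then target a specific pool through a `nodeSelector` on that custom label. A short sketch; the Deployment name and image are placeholders:

```yaml
# Sketch: schedule a workload onto the "pool=big" nodes from the example above.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker           # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: batch-worker
  template:
    metadata:
      labels:
        app: batch-worker
    spec:
      nodeSelector:
        pool: big
      containers:
        - name: worker
          image: registry.example.com/batch:latest   # placeholder image
```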
## Node Taints

Add custom initial taints on worker pool nodes to indicate a node is unique and should only schedule workloads that explicitly tolerate a given taint key.

!!! warning
    Since taints prevent workloads from scheduling onto a node, you must decide whether `kube-system` DaemonSets (e.g. flannel, Calico, Cilium) should tolerate your custom taint by setting `daemonset_tolerations`. If you don't list your custom taint(s), important components won't run on these nodes.

=== "Cluster"

    ```tf
    module "yavin" {
      source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.22.1"

      # Google Cloud
      cluster_name  = "yavin"
      region        = "us-central1"
      dns_zone      = "example.com"
      dns_zone_name = "example-zone"

      # configuration
      ssh_authorized_key = local.ssh_key

      # optional
      worker_count          = 2
      daemonset_tolerations = ["role"]
    }
    ```

=== "Worker Pool"

    ```tf
    module "yavin-pool" {
      source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes/workers?ref=v1.22.1"

      # Google Cloud
      cluster_name = "yavin"
      region       = "europe-west2"
      network      = module.yavin.network_name

      # configuration
      name               = "yavin-16x"
      kubeconfig         = module.yavin.kubeconfig
      ssh_authorized_key = local.ssh_key

      # optional
      worker_count      = 1
      accelerator_type  = "nvidia-tesla-p100"
      accelerator_count = 1
      node_taints       = ["role=gpu:NoSchedule"]
    }
    ```

In the example above, the additional worker would be tainted with `role=gpu:NoSchedule` to prevent workloads from scheduling, but `kube-system` components like flannel, Calico, or Cilium would tolerate that custom taint to run there.

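Workloads intended for those tainted nodes must declare a matching toleration, usually combined with a node selector or affinity so the pods also land only on those nodes. A minimal sketch; the pod name and image are placeholders:

```yaml
# Sketch: a GPU workload that tolerates the custom role=gpu:NoSchedule taint.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job                # hypothetical name
spec:
  tolerations:
    - key: role
      operator: Equal
      value: gpu
      effect: NoSchedule
  containers:
    - name: cuda
      image: registry.example.com/cuda-job:latest   # placeholder image
```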
Some files were not shown because too many files have changed in this diff.