Compare commits

...

144 Commits

Author SHA1 Message Date
d51da49925 Update docs for Kubernetes v1.21.1 and Terraform v0.15.x 2021-05-13 11:34:01 -07:00
2076a779a3 Update Kubernetes from v1.21.0 to v1.21.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#v1211
2021-05-13 11:23:26 -07:00
048094b256 Update etcd from v3.4.15 to v3.4.16
* https://github.com/etcd-io/etcd/blob/main/CHANGELOG-3.4.md
2021-05-13 10:53:04 -07:00
75b063c586 Update Prometheus from v2.25.2 to v2.27.0
* Update Grafana from v7.5.4 to v7.5.6
* https://github.com/prometheus/prometheus/releases/tag/v2.27.0
* https://github.com/grafana/grafana/releases/tag/v7.5.6
2021-05-12 11:47:07 -07:00
1620d1e456 Bump mkdocs-material from 7.1.3 to 7.1.4
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.1.3 to 7.1.4.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.1.3...7.1.4)

Signed-off-by: dependabot[bot] <support@github.com>
2021-05-10 14:53:17 -07:00
939bffbf98 Bump pymdown-extensions from 8.1.1 to 8.2
Bumps [pymdown-extensions](https://github.com/facelessuser/pymdown-extensions) from 8.1.1 to 8.2.
- [Release notes](https://github.com/facelessuser/pymdown-extensions/releases)
- [Commits](https://github.com/facelessuser/pymdown-extensions/compare/8.1.1...8.2)

Signed-off-by: dependabot[bot] <support@github.com>
2021-05-10 14:52:58 -07:00
bc96443710 Update nginx-ingress from v0.45.0 to v0.46.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.46.0
2021-05-05 12:06:20 -07:00
82a7422b3d Change Dependabot pip watcher to check weekly 2021-05-05 11:34:57 -07:00
132ab395a5 Bump pygments from 2.8.1 to 2.9.0
Bumps [pygments](https://github.com/pygments/pygments) from 2.8.1 to 2.9.0.
- [Release notes](https://github.com/pygments/pygments/releases)
- [Changelog](https://github.com/pygments/pygments/blob/master/CHANGES)
- [Commits](https://github.com/pygments/pygments/compare/2.8.1...2.9.0)

Signed-off-by: dependabot[bot] <support@github.com>
2021-05-05 11:32:02 -07:00
5f87eb3ec9 Update Fedora CoreOS Kubelet for cgroups v2
* Fedora CoreOS is beginning to switch from cgroups v1 to
cgroups v2 by default, which changes the sysfs hierarchy
* This will be needed when using a Fedora CoreOS image
that enables cgroups v2 (`next` stream as of this writing)

Rel: https://github.com/coreos/fedora-coreos-tracker/issues/292
2021-04-26 11:48:58 -07:00
b152b9f973 Reduce the default disk_size from 40GB to 30GB
* We typically reduce the `disk_size` in real clusters
since the space is underused. The default should be lower.
2021-04-26 11:43:26 -07:00
9c842395a8 Update Cilium from v1.9.5 to v1.9.6
* https://github.com/cilium/cilium/releases/tag/v1.9.6
2021-04-26 10:55:23 -07:00
6cb9c0341b Bump mkdocs-material from 7.1.2 to 7.1.3
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.1.2 to 7.1.3.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.1.2...7.1.3)

Signed-off-by: dependabot[bot] <support@github.com>
2021-04-26 10:35:00 -07:00
d4fd6d4adb Bump mkdocs-material from 7.1.1 to 7.1.2
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.1.1 to 7.1.2.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.1.1...7.1.2)

Signed-off-by: dependabot[bot] <support@github.com>
2021-04-23 14:26:27 -07:00
3664dfafc2 Update docs with video meetings and referral links
* Use our DigitalOcean referral code for new DigitalOcean
users. This gives new accounts free cloud credits and
provides a smaller cloud credit back to the project
* Link to the new video meeting via one-time Github Sponsor
feature that we're trying out
* List Fedora CoreOS ARM64 as a supported platform (alpha).
Before, this was only mentioned in docs and on the blog.
2021-04-17 19:15:51 -07:00
e535ddd15a Update Grafana from v7.5.3 to v7.5.4
* https://github.com/grafana/grafana/releases/tag/v7.5.4
2021-04-17 11:38:14 -07:00
5752a8f041 Update kube-state-metrics from v2.0.0-rc.1 to v2.0.0
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0
2021-04-17 11:34:52 -07:00
68abbf7b0d Fix docs link on index page (#975)
* Fix Fedora CoreOS Google Cloud tutorial link
2021-04-17 10:52:59 -07:00
67047ead08 Update Terraform version to allow v0.15.0
* Require Terraform version v0.13 <= x < v0.16
2021-04-16 09:46:01 -07:00
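For reference, this kind of constraint is expressed in a `terraform` block; a minimal sketch (the exact constraint strings shipped in each module may differ):

```tf
terraform {
  # Accept Terraform v0.13.x through v0.15.x, but not v0.16+
  required_version = ">= 0.13.0, < 0.16.0"
}
```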
c11e23fc50 Fix minor docs issues and missing changelog links 2021-04-13 09:35:11 -07:00
b647ad8806 Bump mkdocs-material from 7.1.0 to 7.1.1
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.1.0 to 7.1.1.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.1.0...7.1.1)

Signed-off-by: dependabot[bot] <support@github.com>
2021-04-12 20:29:01 -07:00
2eb1ac1b4d Update nginx-ingress from v0.44.0 to v0.45.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.45.0
2021-04-12 00:18:47 -07:00
cb2721ef7d Update Grafana from v7.5.2 to v7.5.3
* https://github.com/grafana/grafana/releases/tag/v7.5.3
2021-04-12 00:17:22 -07:00
fc06d28e13 Remove deprecated field on azurerm_lb_backend_address_pool
* Remove the deprecated `resource_group_name` field from Azure
`azurerm_lb_backend_address_pool` resources
2021-04-11 23:59:17 -07:00
a9078cb52b Add sponsorship badge to Github repo 2021-04-11 16:00:16 -07:00
ebd9570ede Update Fedora CoreOS Config version from v1.1.0 to v1.2.0
* Require [poseidon/ct](https://github.com/poseidon/terraform-provider-ct)
Terraform provider v0.8+
* Require any [snippets](https://typhoon.psdn.io/advanced/customization/#hosts)
customizations to update to v1.2.0

See upgrade [notes](https://typhoon.psdn.io/topics/maintenance/#upgrade-terraform-provider-ct)
2021-04-11 15:26:54 -07:00
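As a rough sketch of the action item, a Fedora CoreOS Config snippet would bump its `version` field to 1.2.0; the file written here and the local name are purely illustrative:

```tf
locals {
  # Illustrative snippet updated to the Fedora CoreOS Config v1.2.0 schema
  worker_snippet = <<-EOT
    variant: fcos
    version: 1.2.0
    storage:
      files:
        - path: /etc/motd.d/typhoon.conf
          mode: 0644
          contents:
            inline: Managed by Typhoon
  EOT
}
```

Strings like `local.worker_snippet` would then be passed via the cluster module's snippet list variables (e.g. `worker_snippets`); check the module docs for the exact variable names.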
34e8db7aae Update static Pod manifests for Kubernetes v1.21.0
* https://github.com/poseidon/terraform-render-bootstrap/pull/257
2021-04-11 15:05:46 -07:00
084e8bea49 Allow custom initial node taints on worker pool nodes
* Add `node_taints` variable to worker modules to set custom
initial node taints on cloud platforms that support auto-scaling
worker pools of heterogeneous nodes (i.e. AWS, Azure, GCP)
* Worker pools can use custom `node_labels` to allow workloads
to select among differentiated nodes, while custom `node_taints`
marks a worker pool's nodes as special to prevent scheduling,
except by workloads that explicitly tolerate the taint
* Expose `daemonset_tolerations` in AWS, Azure, and GCP kubernetes
cluster modules, to determine whether `kube-system` components
should tolerate the custom taint (advanced use covered in docs)

Rel: #550, #663
Closes #429
2021-04-11 15:00:11 -07:00
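A hedged sketch of how these variables might fit together, reusing the Google Cloud module paths from the README (required cluster and pool arguments are omitted; the taint string format shown is an assumption to check against the module docs):

```tf
module "yavin" {
  source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.21.1"
  # ...required cluster arguments omitted...

  # Allow kube-system DaemonSets (CNI, kube-proxy, etc.) to tolerate the custom taint key
  daemonset_tolerations = ["role"]
}

module "yavin-gpu-pool" {
  source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes/workers?ref=v1.21.1"
  # ...required worker pool arguments omitted...

  # Differentiate and reserve this pool for workloads that tolerate the taint
  node_labels = ["role=gpu"]
  node_taints = ["role=gpu:NoSchedule"]
}
```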
d73621c838 Update Kubernetes from v1.20.5 to v1.21.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#v1210
2021-04-08 21:44:31 -07:00
1a6481df04 Update Grafana from v7.5.1 to v7.5.2
* https://github.com/grafana/grafana/releases/tag/v7.5.2
2021-04-04 18:20:02 -07:00
798ec9a92f Change CNI config directory to /etc/cni/net.d
* Change CNI config directory from `/etc/kubernetes/cni/net.d`
to `/etc/cni/net.d` (Kubelet default)
* https://github.com/poseidon/terraform-render-bootstrap/pull/255
2021-04-02 00:03:48 -07:00
96aed4c3c3 Bump mkdocs-material from 7.0.6 to 7.1.0
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 7.0.6 to 7.1.0.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/docs/changelog.md)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/7.0.6...7.1.0)

Signed-off-by: dependabot[bot] <support@github.com>
2021-04-02 00:01:44 -07:00
7372d33af8 Update kube-state-metrics and Grafana
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-rc.1
* https://github.com/grafana/grafana/releases/tag/v7.5.1
2021-03-28 10:53:52 -07:00
451ec771a8 Update Terraform providers and CHANGES for release 2021-03-23 08:45:57 -07:00
4d9846b83e Add DigitalOcean as an OSS sponsorship partner
* Include DigitalOcean logo and link on repo and site
2021-03-21 11:34:36 -07:00
597ca4acce Update CoreDNS from v1.7.0 to v1.8.0
* https://github.com/poseidon/terraform-render-bootstrap/pull/254
2021-03-20 16:47:25 -07:00
507c646e8b Add Kubelet provider-id on AWS
* Set the Kubelet `--provider-id` on AWS based on metadata from
Fedora CoreOS afterburn or Flatcar Linux coreos-metadata
* Based on https://github.com/poseidon/typhoon/pull/951
2021-03-19 12:43:37 -07:00
d8f7da6873 Add dependabot update watcher for docs pypi packages
* Update requirements.txt packages for mkdocs
2021-03-19 11:55:54 -07:00
048f1f514e Update Grafana from v7.4.3 to v7.4.5
* https://github.com/grafana/grafana/releases/tag/v7.4.5
2021-03-19 11:51:52 -07:00
b825cd9afe Update Prometheus from v2.25.1 to v2.25.2
* https://github.com/prometheus/prometheus/releases/tag/v2.25.2
2021-03-19 11:49:38 -07:00
796149d122 Update Kubernetes from v1.20.4 to v1.20.5
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1205
2021-03-19 11:27:31 -07:00
a66bccd590 Update Cilium from v1.9.4 to v1.9.5
* https://github.com/cilium/cilium/releases/tag/v1.9.5
2021-03-14 11:48:22 -07:00
30b1edfcc6 Mark bootstrap token as sensitive in plan/apply
* Mark the bootstrap token as sensitive, which is useful when
Terraform is run in automated CI/CD systems to avoid showing
the token
* https://github.com/poseidon/terraform-render-bootstrap/pull/251
2021-03-14 11:32:35 -07:00
a4afe06b64 Update Calico from v3.17.3 to v3.18.1
* https://docs.projectcalico.org/archive/v3.18/release-notes/
2021-03-14 10:35:24 -07:00
4d58be0816 Update Prometheus from v2.25.0 to v2.25.1
* https://github.com/prometheus/prometheus/releases/tag/v2.25.1
2021-03-14 09:43:15 -07:00
170b768ad8 Add KUBELET_IMAGE to Fedora CoreOS bootstrap.service (#945)
* Align with Flatcar Linux `bootstrap.service`
2021-03-14 09:35:42 -07:00
5bc1cd28c3 Switch kube-state-metrics image from quay to k8s.gcr.io
* kube-state-metrics is now publishing container images
to `k8s.gcr.io` instead of `quay.io`

Rel: https://github.com/kubernetes/kube-state-metrics/issues/1409
2021-03-11 10:56:18 -08:00
13fbac6c79 Update Grafana from v7.4.2 to v7.4.3
* https://github.com/grafana/grafana/releases/tag/v7.4.3
2021-03-05 17:19:54 -08:00
a8fa4a9a06 Update node-exporter and kube-state-metrics
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-rc.0
* https://github.com/prometheus/node_exporter/releases/tag/v1.1.2
2021-03-05 17:13:45 -08:00
a5c1a96df1 Update etcd from v3.4.14 to v3.4.15
* https://github.com/etcd-io/etcd/releases/tag/v3.4.15
2021-03-05 17:02:57 -08:00
6a091e245e Remove Flatcar Linux Edge os_image option
* Flatcar Linux has not published an Edge channel image since
April 2020 and recently removed mention of the channel from
their documentation https://github.com/kinvolk/Flatcar/pull/345
* Users of Flatcar Linux Edge should move to the stable, beta, or
alpha channel, barring any alternate advice from upstream Flatcar
Linux
2021-02-20 16:09:54 -08:00
590796ee62 Update recommended Terraform provider versions
* Sync Terraform provider plugins with those used internally
2021-02-19 00:24:07 -08:00
ec389295fe Update Grafana from v7.4.0 to v7.4.2
* https://github.com/grafana/grafana/releases/tag/v7.4.2
2021-02-19 00:18:39 -08:00
3c807f3478 Update Prometheus from v2.24.1 to v2.25.0
* https://github.com/prometheus/prometheus/releases/tag/v2.25.0
2021-02-19 00:16:35 -08:00
e76fe80b45 Update Kubernetes from v1.20.3 to v1.20.4
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1204
2021-02-19 00:02:07 -08:00
32853aaa7b Update Kubernetes from v1.20.2 to v1.20.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1203
2021-02-17 22:29:33 -08:00
c32a54db40 Update node-exporter from v1.0.1 to v1.1.1
* https://github.com/prometheus/node_exporter/releases/tag/v1.1.1
2021-02-14 14:30:28 -08:00
9671b1c734 Update flannel-cni from v0.4.1 to v0.4.2
* https://github.com/poseidon/flannel-cni/releases/tag/v0.4.2
2021-02-14 12:04:59 -08:00
3b933e1ab3 Update Grafana from v7.3.7 to v7.4.0
* https://github.com/grafana/grafana/releases/tag/v7.4.0
2021-02-07 21:42:18 -08:00
58d8f6f505 Update Prometheus from v2.24.0 to v2.24.1
* https://github.com/prometheus/prometheus/releases/tag/v2.24.1
2021-02-04 22:28:32 -08:00
56853fe222 Update nginx-ingress from v0.43.0 to v0.44.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.44.0
2021-02-04 22:19:58 -08:00
18165d8076 Update Calico from v3.17.1 to v3.17.2
* https://github.com/projectcalico/calico/releases/tag/v3.17.2
2021-02-04 22:03:51 -08:00
50acf28ce5 Update Cilium from v1.9.3 to v1.9.4
* https://github.com/cilium/cilium/releases/tag/v1.9.4
2021-02-03 23:08:22 -08:00
ab793eb842 Update Cilium from v1.9.2 to v1.9.3
* https://github.com/cilium/cilium/releases/tag/v1.9.3
2021-01-26 17:13:52 -08:00
b74c958524 Update Cilium from v1.9.1 to v1.9.2
* https://github.com/cilium/cilium/releases/tag/v1.9.2
2021-01-20 22:06:45 -08:00
2024d3c32e Link to Github Sponsors in README and docs
* Update the Social Contract and Sponsors
2021-01-16 12:56:59 -08:00
11c434915f Update Grafana from v7.3.6 to v7.3.7
* https://github.com/grafana/grafana/releases/tag/v7.3.7
2021-01-16 10:46:56 -08:00
05f7df9e80 Update Kubernetes from v1.20.1 to v1.20.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1202
2021-01-13 17:46:51 -08:00
4220b9ce18 Add support for Terraform v0.14.4+
* Support Terraform v0.13.x and v0.14.4+
2021-01-12 21:43:12 -08:00
6a6af4aa16 Update Prometheus from v2.24.0-rc.0 to v2.24.0
* https://github.com/prometheus/prometheus/releases/tag/v2.24.0
2021-01-12 20:49:18 -08:00
3dcd10f3b8 Update Prometheus v2.23.0 to v2.24.0-rc.0
* https://github.com/prometheus/prometheus/releases/tag/v2.24.0-rc.0
2021-01-01 13:49:28 -08:00
22503993b9 Update nginx-ingress from v0.41.2 to v0.43.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.43.0
* https://github.com/kubernetes/ingress-nginx/issues/6696
2021-01-01 13:44:45 -08:00
cf3aa8885b Update Prometheus rules and Grafana dashboards
* Update Grafana from v7.3.5 to v7.3.6
2020-12-19 14:56:42 -08:00
ba61a137db Add notice about upstream Fedora CoreOS changes
* Highlight that short-term, use of Fedora CoreOS will
require non-RSA SSH keys or a workaround snippet
2020-12-19 14:10:42 -08:00
646bdd78e4 Update Kubernetes from v1.20.0 to v1.20.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1201
2020-12-19 12:56:28 -08:00
c163fbbbcd Update docs and README for release 2020-12-12 12:31:35 -08:00
dc7be431e0 Remove iSCSI mounts from Kubelet
* Remove Kubelet `/etc/iscsi` and `iscsiadm` host mounts that
were added on bare-metal, since these no longer work on either
Fedora CoreOS or Flatcar Linux with newer `iscsiadm`
* These special mounts on bare-metal date back to #350 which
added them to provide a way to use iSCSI in Kubernetes v1.10
* Today, storage should be handled by external CSI providers
that support different storage systems and don't rely
on Kubelet storage utils

Close #907
2020-12-12 11:41:02 -08:00
86e0f806b3 Revert "Add support for Terraform v0.14.x"
This reverts commit 968febb050.
2020-12-11 00:47:57 -08:00
96172ad269 Update Grafana from v7.3.4 to v7.3.5
* https://github.com/grafana/grafana/releases/tag/v7.3.5
2020-12-11 00:24:43 -08:00
3eb20a1f4b Update recommended Terraform provider versions
* Sync Terraform provider plugins with those used internally
2020-12-11 00:15:29 -08:00
ee9ce3d0ab Update Calico from v3.17.0 to v3.17.1
* https://github.com/projectcalico/calico/releases/tag/v3.17.1
2020-12-10 22:48:38 -08:00
a8b8a9b454 Update Kubernetes from v1.20.0-rc.0 to v1.20.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1200
2020-12-08 18:28:13 -08:00
968febb050 Add support for Terraform v0.14.x
* Support Terraform v0.13.x and v0.14.x
2020-12-07 00:22:38 -08:00
bee455f83a Update Cilium from v1.9.0 to v1.9.1
* https://github.com/cilium/cilium/releases/tag/v1.9.1
2020-12-04 14:14:18 -08:00
3e89ea1b4a Promote Fedora CoreOS bare-metal to stable
* Fedora CoreOS is a good choice for use on bare-metal
2020-12-04 14:02:55 -08:00
e77dd6ecd4 Update Kubernetes from v1.19.4 to v1.20.0-rc.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1200-rc0
2020-12-03 16:01:28 -08:00
4fd4a0f540 Move control plane static pod TLS assets to /etc/kubernetes/pki
* Change control plane static pods to mount `/etc/kubernetes/pki`,
instead of `/etc/kubernetes/bootstrap-secrets` to better reflect
their purpose and match some loose conventions upstream
* Place control plane and bootstrap TLS assets and kubeconfig's
in `/etc/kubernetes/pki`
* Mount to `/etc/kubernetes/pki` (rather than `/etc/kubernetes/secrets`)
to match the host location (less surprise)

Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/233
2020-12-02 23:26:42 -08:00
804dfea0f9 Add kubeconfig's for kube-scheduler and kube-controller-manager
* Generate TLS client certificates for `kube-scheduler` and
`kube-controller-manager` with `system:kube-scheduler` and
`system:kube-controller-manager` CNs
* Template separate kubeconfigs for kube-scheduler and
kube-controller manager (`scheduler.conf` and
`controller-manager.conf`). Rename admin for clarity
* Before v1.16.0, Typhoon scheduled a self-hosted control
plane, which allowed the steady-state kube-scheduler and
kube-controller-manager to use a scoped ServiceAccount.
With a static pod control plane, separate CN TLS client
certificates are the nearest equiv.
* https://kubernetes.io/docs/setup/best-practices/certificates/
* Remove unused Kubelet certificate, TLS bootstrap is used
instead
2020-12-01 22:02:15 -08:00
8ba23f364c Add TokenReview and TokenRequestProjection flags
* Add kube-apiserver flags for TokenReview and TokenRequestProjection
(beta, defaults on) to allow using Service Account Token Volume
Projection to create and mount service account tokens tied to a Pod's
lifecycle

Rel:

* https://github.com/poseidon/terraform-render-bootstrap/pull/231
* https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#service-account-token-volume-projection
2020-12-01 20:02:33 -08:00
f6025666eb Update etcd from v3.4.12 to v3.4.14
* https://github.com/etcd-io/etcd/releases/tag/v3.4.14
2020-11-29 20:04:25 -08:00
85eb502f19 Update Prometheus from v2.23.0-rc.0 to v2.23.0
* https://github.com/prometheus/prometheus/releases/tag/v2.23.0
2020-11-29 19:59:27 -08:00
fa3184fb9c Relax terraform-provider-ct version constraint
* Allow terraform-provider-ct versions v0.6+ (e.g. v0.7.1).
Before, only v0.6.x point updates were allowed
* Update terraform-provider-ct to v0.7.1 in docs
* READ the docs before updating terraform-provider-ct,
as changing worker user-data is handled differently
by different cloud platforms
2020-11-29 19:51:26 -08:00
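In a Terraform v0.13+ root module, the relaxed requirement might be declared like this sketch (pick a version per the warning above):

```tf
terraform {
  required_providers {
    ct = {
      source  = "poseidon/ct"
      version = "~> 0.7.1" # any v0.6+ release satisfies the module constraint
    }
  }
}
```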
22565e57e0 Update kube-state-metrics from v2.0.0-alpha.2 to v2.0.0-alpha.3
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-alpha.3
2020-11-25 14:30:11 -08:00
026e1f3648 Update Grafana from v7.3.3 to v7.3.4
* https://github.com/grafana/grafana/releases/tag/v7.3.4
2020-11-25 14:25:15 -08:00
ae548ce213 Update Calico from v3.16.5 to v3.17.0
* Enable Calico MTU auto-detection
* Remove [workaround](https://github.com/poseidon/typhoon/pull/724) to
Calico cni-plugin [issue](https://github.com/projectcalico/cni-plugin/issues/874)

Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/230
2020-11-25 14:22:58 -08:00
e826b49648 Update Matchbox profile to use initramfs and rootfs images
* Fedora CoreOS stable (after Oct 6) ships separate initramfs
and rootfs images, used as initrd's
* Update profiles to match the Matchbox examples, which have
already switched to the new profile, and remove the unused
kernel args
* Requires Fedora CoreOS version which ships rootfs images
(e.g. stable 32.20200923.3.0 or later)

Rel:

* https://github.com/coreos/fedora-coreos-tracker/issues/390#issuecomment-661986987
* da0df01763 (diff-4541f7b7c174f6ae6270135942c1c65ed9e09ebe81239709f5a9fb34e858ddcf)

Supersedes https://github.com/poseidon/typhoon/pull/888
2020-11-25 14:13:39 -08:00
fa8f68f50e Fix Fedora CoreOS AWS AMI query in non-US regions
* An `aws_ami` data source will fail a Terraform plan
if no matching AMI is found, even if the AMI is not
used. ARM64 images are only published to a few US
regions, so the `aws_ami` data query could fail when
creating Fedora CoreOS AWS clusters in non-US regions
* Condition `aws_ami` on whether experimental arch
`arm64` is chosen
* Recent regression introduced in v1.19.4
https://github.com/poseidon/typhoon/pull/875

Closes https://github.com/poseidon/typhoon/issues/886
2020-11-25 11:32:05 -08:00
ba8d972c76 Update Prometheus from v2.22.2 to v2.23.0-rc.0
* https://github.com/prometheus/prometheus/releases/tag/v2.23.0-rc.0
2020-11-24 10:54:42 -08:00
c0347ca0c6 Set kubeconfig and asset_dist as sensitive
* Mark `kubeconfig` and `asset_dist` as `sensitive` to
prevent the Terraform CLI from displaying these values, esp.
for CI systems
* In particular, external tools or tfvars style uses (not
recommended) reportedly display all outputs and are improved
by setting sensitive
* For Terraform v0.14, outputs referencing sensitive fields
must also be annotated as sensitive

Closes https://github.com/poseidon/typhoon/issues/884
2020-11-23 11:41:55 -08:00
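Under Terraform v0.14, a root-module output that forwards one of these values must itself be marked sensitive; a minimal sketch (the `kubeconfig-admin` output name is an assumption to check against the module outputs):

```tf
output "kubeconfig" {
  # Forward the admin kubeconfig without printing it to the console
  value     = module.yavin.kubeconfig-admin
  sensitive = true
}
```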
9f94ab6bcc Rerun terraform fmt for recent variables 2020-11-21 14:20:36 -08:00
5e4f5de271 Enable Network Load Balancer (NLB) dualstack
* NLB subnets assigned both IPv4 and IPv6 addresses
* NLB DNS name has both A and AAAA records
* NLB to target node traffic is IPv4 (no change),
no change to security groups needed
* Ingresses exposed through the recommended Nginx
Ingress Controller addon will be accessible via
IPv4 or IPv6. No change is needed to the app's
CNAME to NLB record

Related: https://aws.amazon.com/about-aws/whats-new/2020/11/network-load-balancer-supports-ipv6/
2020-11-21 14:16:24 -08:00
be28495d79 Update Prometheus from v2.22.1 to v2.22.2
* https://github.com/prometheus/prometheus/releases/tag/v2.22.2
2020-11-19 21:50:48 -08:00
f1356fec24 Update Grafana from v7.3.2 to v7.3.3
* https://github.com/grafana/grafana/releases/tag/v7.3.3
2020-11-19 21:49:11 -08:00
cc00afa4e1 Add Terraform v0.13 input variable validations
* Support for migrating from Terraform v0.12.x to v0.13.x
was added in v1.18.8
* Require Terraform v0.13+. Drop support for Terraform v0.12
2020-11-17 12:02:34 -08:00
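For illustration only (not the specific validations added here), a Terraform v0.13 input variable validation looks like:

```tf
variable "worker_count" {
  type        = number
  description = "Number of worker nodes"

  validation {
    condition     = var.worker_count >= 1
    error_message = "Clusters require at least 1 worker node."
  }
}
```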
5c3b5a20de Update recommended Terraform provider versions
* Sync Terraform provider plugins with those used internally
2020-11-14 13:32:04 -08:00
f5a83667e8 Update Grafana from v7.3.1 to v7.3.2
* https://github.com/grafana/grafana/releases/tag/v7.3.2
2020-11-14 13:30:30 -08:00
a911367c2e Update nginx-ingress from v0.41.0 to v0.41.2
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.41.2
2020-11-14 13:27:06 -08:00
f884de847e Discard Prometheus etcd gRPC failure alert
* Kubernetes watch expiry is not a gRPC code we care about
* Background: This rule is typically removed, but was added back in
2020-11-14 13:17:56 -08:00
1b3a0f6ebc Add experimental Fedora CoreOS arm64 support on AWS
* Add experimental `arch` variable to Fedora CoreOS AWS,
accepting amd64 (default) or arm64 to support native
arm64/aarch64 clusters or mixed/hybrid clusters with
a worker pool of arm64 workers
* Add `daemonset_tolerations` variable to cluster module
(experimental)
* Add `node_taints` variable to workers module
* Requires flannel CNI and experimental Poseidon-built
arm64 Fedora CoreOS AMIs (published to us-east-1, us-east-2,
and us-west-1)

WARN:

* Our AMIs are experimental, may be removed at any time, and
will be removed when Fedora CoreOS publishes official arm64
AMIs. Do NOT use in production

Related:

* https://github.com/poseidon/typhoon/pull/682
2020-11-14 13:09:24 -08:00
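A hedged sketch of the experimental arm64 variables described above on the AWS Fedora CoreOS module (required arguments like DNS zone and SSH key are omitted; instance types and the `ref` are illustrative):

```tf
module "tempest" {
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.21.1"
  # ...required cluster arguments omitted...

  # Experimental: arm64 cluster using Poseidon-built AMIs (flannel required)
  arch            = "arm64"
  networking      = "flannel"
  controller_type = "t4g.small"
  worker_type     = "t4g.small"
}
```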
1113a22f61 Update Kubernetes from v1.19.3 to v1.19.4
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md#v1194
2020-11-11 22:56:27 -08:00
152c7d86bd Change bootstrap.service container from rkt to docker
* Use docker to run `bootstrap.service` container
* Background https://github.com/poseidon/typhoon/pull/855
2020-11-11 22:26:05 -08:00
79deb8a967 Update Cilium from v1.9.0-rc3 to v1.9.0
* https://github.com/cilium/cilium/releases/tag/v1.9.0
2020-11-10 23:42:41 -08:00
f412f0d9f2 Update Calico from v3.16.4 to v3.16.5
* https://github.com/projectcalico/calico/releases/tag/v3.16.5
2020-11-10 22:58:19 -08:00
eca6c4a1a1 Fix broken flatcar linux documentation links (#870)
* Fix old documentation links
2020-11-10 18:30:30 -08:00
133d325013 Update nginx-ingress from v0.40.2 to v0.41.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.41.0
2020-11-08 14:34:52 -08:00
4b05c0180e Update Grafana from v7.3.0 to v7.3.1
* https://github.com/grafana/grafana/releases/tag/v7.3.1
2020-11-08 14:13:39 -08:00
f49ab3a6ee Update Prometheus from v2.22.0 to v2.22.1
* https://github.com/prometheus/prometheus/releases/tag/v2.22.1
2020-11-08 14:12:24 -08:00
0eef16b274 Improve and tidy Fedora CoreOS etcd-member.service
* Allow a snippet with a systemd dropin to set an alternate
image via `ETCD_IMAGE`, for consistency across Fedora CoreOS
and Flatcar Linux
* Drop comments about integrating system containers with
systemd-notify
2020-11-08 11:49:56 -08:00
ad1f59ce91 Change Flatcar etcd-member.service container from rkt to docker
* Use docker to run the `etcd-member.service` container
* Use env-file `/etc/etcd/etcd.env` like podman on FCOS
* Background: https://github.com/poseidon/typhoon/pull/855
2020-11-03 16:42:18 -08:00
82e5ac3e7c Update Cilium from v1.8.5 to v1.9.0-rc3
* https://github.com/poseidon/terraform-render-bootstrap/pull/224
2020-11-03 10:29:07 -08:00
a8f7880511 Update Cilium from v1.8.4 to v1.8.5
* https://github.com/cilium/cilium/releases/tag/v1.8.5
2020-10-29 00:50:18 -07:00
cda5b93b09 Update kube-state-metrics from v2.0.0-alpha.1 to v2.0.0-alpha.2
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-alpha.2
2020-10-28 18:49:40 -07:00
3e9f5f34de Update Grafana from v7.2.2 to v7.3.0
* https://github.com/grafana/grafana/releases/tag/v7.3.0
2020-10-28 17:46:26 -07:00
893d139590 Update Calico from v3.16.3 to v3.16.4
* https://github.com/projectcalico/calico/releases/tag/v3.16.4
2020-10-26 00:50:40 -07:00
fc62e51b2a Update Grafana from v7.2.1 to v7.2.2
* https://github.com/grafana/grafana/releases/tag/v7.2.2
2020-10-22 00:14:04 -07:00
e5ba3329eb Remove bare-metal CoreOS Container Linux profiles
* Remove Matchbox profiles for CoreOS Container Linux
* Simplify the remaining Flatcar Linux profiles
2020-10-21 00:25:10 -07:00
7c3f3ab6d0 Rename container-linux modules to flatcar-linux
* CoreOS Container Linux was deprecated in v1.18.3
* Continue transitioning docs and modules from supporting
both CoreOS and Flatcar "variants" of Container Linux to
now supporting Flatcar Linux and equivalents

Action Required: Update the Flatcar Linux modules `source`
to replace `s/container-linux/flatcar-linux`. See docs for
examples
2020-10-20 22:47:19 -07:00
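The action required is a one-line `source` change per Flatcar Linux module reference, roughly (the `ref` shown is illustrative; use the release you are upgrading to):

```tf
module "mercury" {
  # Before: git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=...
  source = "git::https://github.com/poseidon/typhoon//bare-metal/flatcar-linux/kubernetes?ref=v1.19.4"
  # ...other arguments unchanged...
}
```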
a99a990d49 Remove unused Kubelet tls mounts
* Kubelet trusts only the cluster CA certificate (and
certificates in the Kubelet debian base image), so there
is no longer a need to mount the host's trusted certs
* Similar change on Flatcar Linux in
https://github.com/poseidon/typhoon/pull/855

Rel: https://github.com/poseidon/typhoon/pull/810
2020-10-18 23:48:21 -07:00
df17253e72 Fix delete node permission on Fedora CoreOS node shutdown
* On cloud platforms, `delete-node.service` tries to delete the
local node (not always possible depending on preemption time)
* Since v1.18.3, kubelet TLS bootstrap generates a kubeconfig
in `/var/lib/kubelet` which should be used with kubectl in
the delete-node oneshot
2020-10-18 23:38:11 -07:00
eda78db08e Change Flatcar kubelet.service container from rkt to docker
* Use docker to run the `kubelet.service` container
* Update Kubelet mounts to match Fedora CoreOS
* Remove unused `/etc/ssl/certs` mount (see
https://github.com/poseidon/typhoon/pull/810)
* Remove unused `/usr/share/ca-certificates` mount
* Remove `/etc/resolv.conf` mount, Docker default is ok
* Change `delete-node.service` to use docker instead of rkt
and inline ExecStart, as was done on Fedora CoreOS
* Fix permission denied on shutdown `delete-node`, caused
by the kubeconfig mount changing with the introduction of
node TLS bootstrap

Background

* podman, rkt, and runc daemonless container process runners
provide advantages over the docker daemon for system containers.
Docker requires workarounds for use in systemd units where the
ExecStart must tail logs so systemd can monitor the daemonized
container. https://github.com/moby/moby/issues/6791
* Why switch then? On Flatcar Linux, podman isn't shipped. rkt
works, but isn't actively developed while container standards continue
to move forward. Typhoon has used runc for the Kubelet runner
before in Fedora Atomic, but it's more low-level. So we're left
with Docker, which is less than ideal, but shipped in Flatcar
* Flatcar Linux appears to be shifting system components to
use docker, which does provide some limited guards against
breakages (e.g. Flatcar cannot enable docker live restore)
2020-10-18 23:24:45 -07:00
afac46e39a Remove asset_dir variable and optional asset writes
* Originally, poseidon/terraform-render-bootstrap generated
TLS certificates, manifests, and cluster "assets" and wrote
them to local disk (`asset_dir`) during `terraform apply`
cluster bootstrap
* Typhoon v1.17.0 introduced bootstrapping using only Terraform
state to store cluster assets, to avoid ever writing sensitive
materials to disk and improve automated use-cases. `asset_dir`
was changed to optional and defaulted to "" (no writes)
* Typhoon v1.18.0 deprecated the `asset_dir` variable, removed
docs, and announced it would be deleted in the future.
* Add Terraform output `assets_dist` map
* Remove the `asset_dir` variable

Cluster assets are now stored in Terraform state only. For those
who wish to write those assets to local files, it is possible to
do so explicitly:

```tf
resource "local_file" "assets" {
  for_each = module.yavin.assets_dist
  filename = "some-assets/${each.key}"
  content  = each.value
}
```

Related:

* https://github.com/poseidon/typhoon/pull/595
* https://github.com/poseidon/typhoon/pull/678
2020-10-17 15:00:15 -07:00
b1e680ac0c Update recommended Terraform provider versions
* Sync Terraform provider plugins with those used internally
2020-10-17 13:56:24 -07:00
9fbfbdb854 Update Prometheus from v2.21.0 to v2.22.0
* https://github.com/prometheus/prometheus/releases/tag/v2.22.0
2020-10-17 12:38:25 -07:00
511f5272f4 Update Calico from v3.15.3 to v3.16.3
* https://github.com/projectcalico/calico/releases/tag/v3.16.3
* https://github.com/poseidon/terraform-render-bootstrap/pull/212
2020-10-15 20:08:51 -07:00
46ca5e8813 Update Kubernetes from v1.19.2 to v1.19.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md#v1193
2020-10-14 20:47:49 -07:00
394e496cc7 Update Grafana from v7.2.0 to v7.2.1
* https://github.com/grafana/grafana/releases/tag/v7.2.1
2020-10-11 13:21:25 -07:00
a38ec1a856 Update recommended Terraform provider versions
* Sync Terraform provider plugins with those used internally
2020-10-11 13:06:53 -07:00
7881f4bd86 Update kube-state-metrics from v1.9.7 to v2.0.0-alpha.1
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-alpha
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-alpha.1
2020-10-11 12:35:43 -07:00
d5b5b7cb02 Update nginx-ingress from v0.40.0 to v0.40.2
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.40.2
2020-10-06 23:52:15 -07:00
759a48be7c Update mkdocs-material from v5.5.12 to v6.0.1
* Update OS kernel, systemd, and docker versions
2020-10-02 01:18:38 -07:00
b39a1d70da Update nginx-ingress from v0.35.0 to v0.40.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.40.0
2020-10-02 01:00:35 -07:00
901f7939b2 Update Cilium from v1.8.3 to v1.8.4
* https://github.com/cilium/cilium/releases/tag/v1.8.4
2020-10-02 00:24:26 -07:00
d65085ce14 Update Grafana from v7.1.5 to v7.2.0
* https://github.com/grafana/grafana/releases/tag/v7.2.0
2020-09-24 20:58:32 -07:00
343db5b578 Remove references to CoreOS Container Linux
* CoreOS Container Linux was deprecated in v1.18.3 (May 2020)
in favor of Fedora CoreOS and Flatcar Linux. CoreOS Container
Linux references were kept to give folks more time to migrate,
but AMIs have now been deleted. Time is up.

Rel: https://coreos.com/os/eol/
2020-09-24 20:51:02 -07:00
190 changed files with 3323 additions and 2771 deletions

.github/FUNDING.yml (new file)

@ -0,0 +1 @@
github: [poseidon]

.github/dependabot.yaml (new file)

@ -0,0 +1,9 @@
version: 2
updates:
  - package-ecosystem: pip
    directory: "/"
    schedule:
      interval: weekly
    pull-request-branch-name:
      separator: "-"
    open-pull-requests-limit: 3

CHANGELOG.md

@ -4,6 +4,255 @@ Notable changes between versions.
## Latest
## v1.21.1
* Kubernetes [v1.21.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#v1211)
* Add Terraform v0.15.x support ([#974](https://github.com/poseidon/typhoon/pull/974))
* Continue to support Terraform v0.13.x and v0.14.4+
* Update etcd from v3.4.15 to [v3.4.16](https://github.com/etcd-io/etcd/releases/tag/v3.4.16)
* Update Cilium from v1.9.5 to [v1.9.6](https://github.com/cilium/cilium/releases/tag/v1.9.6)
* Update Calico from v3.18.1 to [v3.19.0](https://github.com/projectcalico/calico/releases/tag/v3.19.0)
### AWS
* Reduce the default `disk_size` from 40GB to 30GB ([#983](https://github.com/poseidon/typhoon/pull/983))
### Azure
* Reduce the default `disk_size` from 40GB to 30GB ([#983](https://github.com/poseidon/typhoon/pull/983))
### Google Cloud
* Reduce the default `disk_size` from 40GB to 30GB ([#983](https://github.com/poseidon/typhoon/pull/983))
### Fedora CoreOS
* Update Kubelet mounts for cgroups v2 ([#978](https://github.com/poseidon/typhoon/pull/978))
### Addons
* Update kube-state-metrics from v2.0.0-rc.1 to [v2.0.0](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0)
* Update Prometheus from v2.25.2 to [v2.27.0](https://github.com/prometheus/prometheus/releases/tag/v2.27.0)
* Update Grafana from v7.5.3 to [v7.5.6](https://github.com/grafana/grafana/releases/tag/v7.5.6)
* Update nginx-ingress from v0.45.0 to [v0.46.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.46.0)
## v1.21.0
* Kubernetes [v1.21.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#v1210)
* Enable `tokencleaner` controller ([#969](https://github.com/poseidon/typhoon/pull/969))
* Enable `kube-scheduler` and `kube-controller-manager` separate authn/z kubeconfig
* Change CNI config location from /etc/kubernetes/cni/net.d to /etc/cni/net.d ([#965](https://github.com/poseidon/typhoon/pull/965))
* Change `kube-controller-manager` to mount `/var/lib/kubelet/volumeplugins` directly
* Remove unused `cloud-provider` flags
* Update Fedora CoreOS Config version from v1.1.0 to v1.2.0 ([#970](https://github.com/poseidon/typhoon/pull/970))
* Require [poseidon/ct](https://github.com/poseidon/terraform-provider-ct) Terraform provider v0.8+ ([notes](https://typhoon.psdn.io/topics/maintenance/#upgrade-terraform-provider-ct))
* Require any [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customizations to update to v1.2.0
### AWS
* Allow setting custom initial node taints on worker pools ([#968](https://github.com/poseidon/typhoon/pull/968))
* Add `node_taints` variable to internal `workers` pool module to set initial node taints
* Add `daemonset_tolerations` so `kube-system` DaemonSets can tolerate custom taints
### Azure
* Allow setting custom initial node taints on worker pools ([#968](https://github.com/poseidon/typhoon/pull/968))
* Add `node_taints` variable to internal `workers` pool module to set initial node taints
* Add `daemonset_tolerations` so `kube-system` DaemonSets can tolerate custom taints
* Remove deprecated `azurerm_lb_backend_address_pool` field `resource_group_name` ([#972](https://github.com/poseidon/typhoon/pull/972))
### Google Cloud
* Allow setting custom initial node taints on worker pools ([#968](https://github.com/poseidon/typhoon/pull/968))
* Add `node_taints` variable to internal `workers` pool module to set initial node taints
* Add `daemonset_tolerations` so `kube-system` DaemonSets can tolerate custom taints
### Addons
* Update nginx-ingress from v0.44.0 to [v0.45.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.45.0)
* Update kube-state-metrics from v2.0.0-rc.0 to [v2.0.0-rc.1](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-rc.1)
* Update Grafana from v7.4.5 to [v7.5.3](https://github.com/grafana/grafana/releases/tag/v7.5.3)
## v1.20.5
* Kubernetes [v1.20.5](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1205)
* Update etcd from v3.4.14 to [v3.4.15](https://github.com/etcd-io/etcd/releases/tag/v3.4.15)
* Update Cilium from v1.9.4 to [v1.9.5](https://github.com/cilium/cilium/releases/tag/v1.9.5)
* Update Calico from v3.17.3 to [v3.18.1](https://github.com/projectcalico/calico/releases/tag/v3.18.1)
* Update CoreDNS from v1.7.0 to [v1.8.0](https://coredns.io/2020/10/22/coredns-1.8.0-release/)
* Mark bootstrap token as sensitive in Terraform plans ([#949](https://github.com/poseidon/typhoon/pull/949))
### Fedora CoreOS
* Set Kubelet `provider-id` ([#951](https://github.com/poseidon/typhoon/pull/951))
### Flatcar Linux
#### AWS
* Set Kubelet `provider-id` ([#951](https://github.com/poseidon/typhoon/pull/951))
* Remove `os_image` option `flatcar-edge` ([#943](https://github.com/poseidon/typhoon/pull/943))
#### Azure
* Remove `os_image` option `flatcar-edge` ([#943](https://github.com/poseidon/typhoon/pull/943))
#### Bare-Metal
* Remove `os_channel` option `flatcar-edge` ([#943](https://github.com/poseidon/typhoon/pull/943))
### Addons
* Update Prometheus from v2.25.0 to [v2.25.2](https://github.com/prometheus/prometheus/releases/tag/v2.25.2)
* Update kube-state-metrics from v2.0.0-alpha.3 to [v2.0.0-rc.0](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-rc.0)
* Switch image from `quay.io` to `k8s.gcr.io` ([#946](https://github.com/poseidon/typhoon/pull/946))
* Update node-exporter from v1.1.1 to [v1.1.2](https://github.com/prometheus/node_exporter/releases/tag/v1.1.2)
* Update Grafana from v7.4.2 to [v7.4.5](https://github.com/grafana/grafana/releases/tag/v7.4.5)
## v1.20.4
* Kubernetes [v1.20.4](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1204)
* Update Cilium from v1.9.1 to [v1.9.4](https://github.com/cilium/cilium/releases/tag/v1.9.4)
* Update Calico from v3.17.1 to [v3.17.3](https://github.com/projectcalico/calico/releases/tag/v3.17.3)
* Update flannel-cni from v0.4.1 to [v0.4.2](https://github.com/poseidon/flannel-cni/releases/tag/v0.4.2)
### Addons
* Update nginx-ingress from v0.43.0 to [v0.44.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.44.0)
* Update Prometheus from v2.24.0 to [v2.25.0](https://github.com/prometheus/prometheus/releases/tag/v2.25.0)
* Update node-exporter from v1.0.1 to [v1.1.1](https://github.com/prometheus/node_exporter/releases/tag/v1.1.1)
* Update Grafana from v7.3.7 to [v7.4.2](https://github.com/grafana/grafana/releases/tag/v7.4.2)
## v1.20.2
* Kubernetes [v1.20.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1202)
* Support Terraform v0.13.x and v0.14.4+ ([#924](https://github.com/poseidon/typhoon/pull/923))
### Addons
* Update nginx-ingress from v0.41.2 to [v0.43.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.43.0)
* Update Prometheus from v2.23.0 to [v2.24.0](https://github.com/prometheus/prometheus/releases/tag/v2.24.0)
* Update Grafana from v7.3.6 to [v7.3.7](https://github.com/grafana/grafana/releases/tag/v7.3.7)
## v1.20.1
* Kubernetes [v1.20.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1201)
### Fedora CoreOS
* Fedora CoreOS 33 has stronger crypto defaults ([**notice**](https://docs.fedoraproject.org/en-US/fedora-coreos/faq/#_why_does_ssh_stop_working_after_upgrading_to_fedora_33), [#915](https://github.com/poseidon/typhoon/issues/915))
* Use a non-RSA SSH key or add the workaround provided in upstream [Fedora docs](https://docs.fedoraproject.org/en-US/fedora-coreos/faq/#_why_does_ssh_stop_working_after_upgrading_to_fedora_33) as a [snippet](https://typhoon.psdn.io/advanced/customization/#fedora-coreos) (**action required**)
### Addons
* Update Grafana from v7.3.5 to [v7.3.6](https://github.com/grafana/grafana/releases/tag/v7.3.6)
## v1.20.0
* Kubernetes [v1.20.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#v1200)
* Add input variable validations ([#880](https://github.com/poseidon/typhoon/pull/880))
* Require Terraform v0.13+ ([migration guide](https://typhoon.psdn.io/topics/maintenance/#terraform-versions))
* Set output sensitive to suppress console display for some cases ([#885](https://github.com/poseidon/typhoon/pull/885))
* Add service account token [volume projection](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#service-account-token-volume-projection) ([#897](https://github.com/poseidon/typhoon/pull/897))
* Scope kube-scheduler and kube-controller-manager permissions ([#898](https://github.com/poseidon/typhoon/pull/898))
* Update etcd from v3.4.12 to [v3.4.14](https://github.com/etcd-io/etcd/releases/tag/v3.4.14)
* Update Calico from v3.16.5 to v3.17.1 ([#890](https://github.com/poseidon/typhoon/pull/890))
* Enable Calico MTU auto-detection
* Remove [workaround](https://github.com/poseidon/typhoon/pull/724) to Calico cni-plugin [issue](https://github.com/projectcalico/cni-plugin/issues/874)
* Update Cilium from v1.9.0 to [v1.9.1](https://github.com/cilium/cilium/releases/tag/v1.9.1)
* Relax `terraform-provider-ct` version constraint to v0.6+ ([#893](https://github.com/poseidon/typhoon/pull/893))
* Allow upgrading `terraform-provider-ct` to v0.7.x ([warn](https://typhoon.psdn.io/topics/maintenance/#upgrade-terraform-provider-ct))
### AWS
* Enable Network Load Balancer (NLB) dualstack ([#883](https://github.com/poseidon/typhoon/pull/883))
* NLB subnets assigned both IPv4 and IPv6 addresses
* NLB DNS name has both A and AAAA records
* NLB to target node traffic is IPv4 (no change)
### Bare-Metal
* Remove iSCSI `/etc/iscsi` and `iscsiadm` mounts from Kubelet ([#912](https://github.com/poseidon/typhoon/pull/912))
### Fedora CoreOS
#### AWS
* Fix AMI query which could fail in some regions ([#887](https://github.com/poseidon/typhoon/pull/887))
#### Bare-Metal
* Promote Fedora CoreOS to stable
* Use initramfs and rootfs images as initrd's ([#889](https://github.com/poseidon/typhoon/pull/889))
* Requires Fedora CoreOS version with rootfs images (e.g. 32.20200923.3.0+)
### Addons
* Update Prometheus from v2.22.2 to [v2.23.0](https://github.com/prometheus/prometheus/releases/tag/v2.23.0)
* Update kube-state-metrics from v2.0.0-alpha.2 to [v2.0.0-alpha.3](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-alpha.3)
* Update Grafana from v7.3.2 to [v7.3.5](https://github.com/grafana/grafana/releases/tag/v7.3.5)
## v1.19.4
* Kubernetes [v1.19.4](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md#v1194)
* Update Cilium from v1.8.4 to [v1.9.0](https://github.com/cilium/cilium/releases/tag/v1.9.0)
* Update Calico from v3.16.3 to [v3.16.5](https://github.com/projectcalico/calico/releases/tag/v3.16.5)
* Remove `asset_dir` variable (defaulted off in [v1.17.0](https://github.com/poseidon/typhoon/pull/595), deprecated in [v1.18.0](https://github.com/poseidon/typhoon/pull/678))
### Fedora CoreOS
* Improve `etcd-member.service` systemd unit ([#868](https://github.com/poseidon/typhoon/pull/868))
* Allow a snippet with a systemd dropin to set an alternate image (e.g. mirror)
* Fix local node delete oneshot on node shutdown ([#856](https://github.com/poseidon/typhoon/pull/855))
#### AWS
* Add experimental Fedora CoreOS arm64 support ([docs](https://typhoon.psdn.io/advanced/arm64/), [#875](https://github.com/poseidon/typhoon/pull/875))
* Allow arm64 full-cluster or mixed/hybrid cluster with worker pools
* Add `arch` variable to cluster module
* Add `daemonset_tolerations` variable to cluster module
* Add `node_taints` variable to workers module
* Requires flannel CNI provider and use of experimental AMI (see docs)
### Flatcar Linux
* Rename `container-linux` modules to `flatcar-linux` ([#858](https://github.com/poseidon/typhoon/issues/858)) (**action required**)
* Change on-host system containers from rkt to docker
* Change `etcd-member.service` container runner from rkt to docker ([#867](https://github.com/poseidon/typhoon/pull/867))
* Change `kubelet.service` container runner from rkt-fly to docker ([#855](https://github.com/poseidon/typhoon/pull/855))
* Change `bootstrap.service` container runner from rkt to docker ([#873](https://github.com/poseidon/typhoon/pull/873))
* Change `delete-node.service` to use docker and an inline ExecStart ([#855](https://github.com/poseidon/typhoon/pull/855))
* Fix local node delete oneshot on node shutdown ([#855](https://github.com/poseidon/typhoon/pull/855))
* Remove CoreOS Container Linux Matchbox profiles ([#859](https://github.com/poseidon/typhoon/pull/858))
### Addons
* Update nginx-ingress from v0.40.2 to [v0.41.2](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.41.2)
* Update Prometheus from v2.22.0 to [v2.22.1](https://github.com/prometheus/prometheus/releases/tag/v2.22.1)
* Update kube-state-metrics from v2.0.0-alpha.1 to [v2.0.0-alpha.2](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-alpha.2)
* Update Grafana from v7.2.1 to [v7.3.2](https://github.com/grafana/grafana/releases/tag/v7.3.2)
## v1.19.3
* Kubernetes [v1.19.3](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md#v1193)
* Update Cilium from v1.8.3 to [v1.8.4](https://github.com/cilium/cilium/releases/tag/v1.8.4)
* Update Calico from v3.15.3 to [v3.16.3](https://github.com/projectcalico/calico/releases/tag/v3.16.3) ([#851](https://github.com/poseidon/typhoon/pull/851))
* Update flannel from v0.13.0-rc2 to v0.13.0 ([#219](https://github.com/poseidon/terraform-render-bootstrap/pull/219))
### Flatcar Linux
* Remove references to CoreOS Container Linux ([#839](https://github.com/poseidon/typhoon/pull/839))
* Fix error querying for coreos AMI on AWS ([#838](https://github.com/poseidon/typhoon/issues/838))
### Addons
* Update nginx-ingress from v0.35.0 to [v0.40.2](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.40.2)
* Update Grafana from v7.1.5 to [v7.2.1](https://github.com/grafana/grafana/releases/tag/v7.2.1)
* Update Prometheus from v2.21.0 to [v2.22.0](https://github.com/prometheus/prometheus/releases/tag/v2.22.0)
* Update kube-state-metrics from v1.9.7 to [v2.0.0-alpha.1](https://github.com/kubernetes/kube-state-metrics/releases/tag/v2.0.0-alpha.1)
## v1.19.2
* Kubernetes [v1.19.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md#v1192)
* Update flannel from v0.12.0 to v0.13.0-rc2 ([#216](https://github.com/poseidon/terraform-render-bootstrap/pull/216))
* Update flannel-cni from v0.4.0 to v0.4.1

README.md

@ -11,10 +11,10 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.21.1 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [preemptible](https://typhoon.psdn.io/flatcar-linux/google-cloud/#preemption) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization
* Ready for Ingress, Prometheus, Grafana, CSI, or other [addons](https://typhoon.psdn.io/addons/overview/)
## Modules
@ -27,19 +27,23 @@ Typhoon is available for [Fedora CoreOS](https://getfedora.org/coreos/).
|---------------|------------------|------------------|--------|
| AWS | Fedora CoreOS | [aws/fedora-coreos/kubernetes](aws/fedora-coreos/kubernetes) | stable |
| Azure | Fedora CoreOS | [azure/fedora-coreos/kubernetes](azure/fedora-coreos/kubernetes) | alpha |
| Bare-Metal | Fedora CoreOS | [bare-metal/fedora-coreos/kubernetes](bare-metal/fedora-coreos/kubernetes) | stable |
| DigitalOcean | Fedora CoreOS | [digital-ocean/fedora-coreos/kubernetes](digital-ocean/fedora-coreos/kubernetes) | beta |
| Google Cloud | Fedora CoreOS | [google-cloud/fedora-coreos/kubernetes](google-cloud/fedora-coreos/kubernetes) | stable |
| Platform | Operating System | Terraform Module | Status |
|---------------|------------------|------------------|--------|
| AWS | Fedora CoreOS (ARM64) | [aws/fedora-coreos/kubernetes](aws/fedora-coreos/kubernetes) | alpha |
Typhoon is available for [Flatcar Linux](https://www.flatcar-linux.org/releases/).
| Platform | Operating System | Terraform Module | Status |
|---------------|------------------|------------------|--------|
| AWS | Flatcar Linux | [aws/flatcar-linux/kubernetes](aws/flatcar-linux/kubernetes) | stable |
| Azure | Flatcar Linux | [azure/flatcar-linux/kubernetes](azure/flatcar-linux/kubernetes) | alpha |
| Bare-Metal | Flatcar Linux | [bare-metal/flatcar-linux/kubernetes](bare-metal/flatcar-linux/kubernetes) | stable |
| DigitalOcean | Flatcar Linux | [digital-ocean/flatcar-linux/kubernetes](digital-ocean/flatcar-linux/kubernetes) | beta |
| Google Cloud | Flatcar Linux | [google-cloud/flatcar-linux/kubernetes](google-cloud/flatcar-linux/kubernetes) | beta |
## Documentation
@ -54,7 +58,7 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
```tf
module "yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.21.1"
# Google Cloud
cluster_name = "yavin"
@ -93,9 +97,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
$ export KUBECONFIG=/home/user/.kube/configs/yavin-config
$ kubectl get nodes
NAME ROLES STATUS AGE VERSION
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.21.1
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.21.1
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.21.1
```
List the pods.
@ -126,7 +130,7 @@ Typhoon is strict about minimalism, maturity, and scope. These are not in scope:
## Help
Schedule a meeting via [Github Sponsors](https://github.com/sponsors/poseidon?frequency=one-time) to discuss your use case. You can also ask questions on the IRC #typhoon channel on [freenode.net](http://freenode.net/) (unmonitored).
## Motivation
@ -136,12 +140,17 @@ Typhoon addresses real world needs, which you may share. It is honest about limi
## Social Contract
Typhoon is not a product, trial, or free-tier. Typhoon does not offer support, services, or charge money. And Typhoon is independent of operating system or platform vendors.
Typhoon clusters will contain only [free](https://www.debian.org/intro/free) components. Cluster components will not collect data on users without their permission.
## Sponsors
Poseidon's Github [Sponsors](https://github.com/sponsors/poseidon) support the infrastructure and operational costs of providing Typhoon.
* [DigitalOcean](https://www.digitalocean.com/) kindly provides credits to support Typhoon test clusters.
<a href="https://www.digitalocean.com/">
<img src="https://opensource.nyc3.cdn.digitaloceanspaces.com/attribution/assets/SVG/DO_Logo_horizontal_blue.svg" width="201px">
</a>
<br>
If you'd like your company here, please contact dghubble at psdn.io.

View File

@ -37,6 +37,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -129,6 +130,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -221,6 +223,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -326,6 +329,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -432,6 +436,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -537,6 +542,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -643,6 +649,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -762,6 +769,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -854,6 +862,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },

View File

@@ -172,7 +172,7 @@ data:
"tableColumn": "",
"targets": [
{
- "expr": "sum(kubelet_running_pods{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"})",
+ "expr": "sum(kubelet_running_pods{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}) OR sum(kubelet_running_pod_count{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
@@ -256,7 +256,7 @@ data:
"tableColumn": "",
"targets": [
{
- "expr": "sum(kubelet_running_containers{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"})",
+ "expr": "sum(kubelet_running_containers{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"}) OR sum(kubelet_running_container_count{cluster=\"$cluster\", job=\"kubelet\", instance=~\"$instance\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{instance}}",
@ -553,6 +553,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -645,6 +646,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -750,6 +752,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -855,6 +858,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -954,6 +958,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1066,6 +1071,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1160,6 +1166,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1267,6 +1274,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1374,6 +1382,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1466,6 +1475,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1572,6 +1582,7 @@ data:
"datasource": "$datasource", "datasource": "$datasource",
"description": "Pod lifecycle event generator", "description": "Pod lifecycle event generator",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1664,6 +1675,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1769,6 +1781,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1874,6 +1887,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -2000,6 +2014,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -2105,6 +2120,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -2197,6 +2213,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -2289,6 +2306,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -2613,6 +2631,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -2705,6 +2724,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -2810,6 +2830,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -2902,6 +2923,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -3007,6 +3029,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -3120,6 +3143,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -3225,6 +3249,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -3330,6 +3355,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -3422,6 +3448,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -3514,6 +3541,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },

View File

@@ -60,7 +60,7 @@ data:
"steppedLine": false,
"targets": [
{
- "expr": "1 - avg(rate(node_cpu_seconds_total{mode=\"idle\", cluster=\"$cluster\"}[$__interval]))",
+ "expr": "1 - avg(rate(node_cpu_seconds_total{mode=\"idle\", cluster=\"$cluster\"}[$__rate_interval]))",
"format": "time_series",
"instant": true,
"intervalFactor": 2,
@ -1586,7 +1586,7 @@ data:
], ],
"targets": [ "targets": [
{ {
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)", "expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -1595,7 +1595,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)", "expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -1604,7 +1604,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)", "expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -1613,7 +1613,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)", "expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -1622,7 +1622,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)", "expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -1631,7 +1631,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)", "expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -1731,7 +1731,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)", "expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{namespace}}", "legendFormat": "{{namespace}}",
@ -1829,7 +1829,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)", "expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{namespace}}", "legendFormat": "{{namespace}}",
@ -1927,7 +1927,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)", "expr": "avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{namespace}}", "legendFormat": "{{namespace}}",
@ -2025,7 +2025,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)", "expr": "avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{namespace}}", "legendFormat": "{{namespace}}",
@ -2123,7 +2123,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)", "expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{namespace}}", "legendFormat": "{{namespace}}",
@@ -2221,7 +2221,7 @@ data:
"steppedLine": false,
"targets": [
{
- "expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)",
+ "expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{namespace}}",
@ -2319,7 +2319,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)", "expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{namespace}}", "legendFormat": "{{namespace}}",
@ -2417,7 +2417,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__interval])) by (namespace)", "expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\".+\"}[$__rate_interval])) by (namespace)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{namespace}}", "legendFormat": "{{namespace}}",
@ -4019,7 +4019,7 @@ data:
], ],
"targets": [ "targets": [
{ {
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)", "expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -4028,7 +4028,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)", "expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -4037,7 +4037,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)", "expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -4046,7 +4046,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)", "expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -4055,7 +4055,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)", "expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -4064,7 +4064,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)", "expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -4164,7 +4164,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)", "expr": "sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -4262,7 +4262,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)", "expr": "sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -4360,7 +4360,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)", "expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@@ -4458,7 +4458,7 @@ data:
"steppedLine": false,
"targets": [
{
- "expr": "sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)",
+ "expr": "sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "{{pod}}",
@ -4556,7 +4556,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)", "expr": "sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -4654,7 +4654,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])) by (pod)", "expr": "sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])) by (pod)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",

View File

@ -1058,7 +1058,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(irate(container_network_receive_bytes_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__interval])) by (pod)", "expr": "sum(irate(container_network_receive_bytes_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -1157,7 +1157,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(irate(container_network_transmit_bytes_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__interval])) by (pod)", "expr": "sum(irate(container_network_transmit_bytes_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -1256,7 +1256,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(irate(container_network_receive_packets_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__interval])) by (pod)", "expr": "sum(irate(container_network_receive_packets_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -1355,7 +1355,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(irate(container_network_transmit_packets_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__interval])) by (pod)", "expr": "sum(irate(container_network_transmit_packets_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -1454,7 +1454,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(irate(container_network_receive_packets_dropped_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__interval])) by (pod)", "expr": "sum(irate(container_network_receive_packets_dropped_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -1553,7 +1553,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "sum(irate(container_network_transmit_packets_dropped_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__interval])) by (pod)", "expr": "sum(irate(container_network_transmit_packets_dropped_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -2707,7 +2707,7 @@ data:
], ],
"targets": [ "targets": [
{ {
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", "expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -2716,7 +2716,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", "expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -2725,7 +2725,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", "expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -2734,7 +2734,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", "expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -2743,7 +2743,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", "expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@@ -2752,7 +2752,7 @@ data:
"step": 10
},
{
- "expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
+ "expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "table",
"instant": true,
"intervalFactor": 2,
@ -2852,7 +2852,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", "expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -2950,7 +2950,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", "expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -3048,7 +3048,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", "expr": "(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -3146,7 +3146,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", "expr": "(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -3244,7 +3244,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", "expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -3342,7 +3342,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", "expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -3440,7 +3440,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", "expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -3538,7 +3538,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", "expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{pod}}", "legendFormat": "{{pod}}",
@ -4902,7 +4902,7 @@ data:
], ],
"targets": [ "targets": [
{ {
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n", "expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -4911,7 +4911,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n", "expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -4920,7 +4920,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n", "expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -4929,7 +4929,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n", "expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -4938,7 +4938,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n", "expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -4947,7 +4947,7 @@ data:
"step": 10 "step": 10
}, },
{ {
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n", "expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload_type=\"$type\"}) by (workload))\n",
"format": "table", "format": "table",
"instant": true, "instant": true,
"intervalFactor": 2, "intervalFactor": 2,
@ -5047,7 +5047,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", "expr": "(sum(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{workload}}", "legendFormat": "{{workload}}",
@ -5145,7 +5145,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", "expr": "(sum(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{workload}}", "legendFormat": "{{workload}}",
@ -5243,7 +5243,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", "expr": "(avg(irate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{workload}}", "legendFormat": "{{workload}}",
@ -5341,7 +5341,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", "expr": "(avg(irate(container_network_transmit_bytes_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{workload}}", "legendFormat": "{{workload}}",
@ -5439,7 +5439,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", "expr": "(sum(irate(container_network_receive_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{workload}}", "legendFormat": "{{workload}}",
@ -5537,7 +5537,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", "expr": "(sum(irate(container_network_transmit_packets_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{workload}}", "legendFormat": "{{workload}}",
@ -5635,7 +5635,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", "expr": "(sum(irate(container_network_receive_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{workload}}", "legendFormat": "{{workload}}",
@ -5733,7 +5733,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", "expr": "(sum(irate(container_network_transmit_packets_dropped_total{cluster=\"$cluster\", namespace=~\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"$cluster\", namespace=~\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{workload}}", "legendFormat": "{{workload}}",

View File

@@ -140,8 +140,9 @@ data:
"dashes": false,
"datasource": "$datasource",
"decimals": 3,
- "description": "How much error budget is left looking at our 0.990% availability gurantees?",
+ "description": "How much error budget is left looking at our 0.990% availability guarantees?",
"fill": 10,
+ "fillGradient": 0,
"gridPos": {
},
@ -336,6 +337,7 @@ data:
"datasource": "$datasource", "datasource": "$datasource",
"description": "How many read requests (LIST,GET) per second do the apiservers get by code?", "description": "How many read requests (LIST,GET) per second do the apiservers get by code?",
"fill": 10, "fill": 10,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -444,6 +446,7 @@ data:
"datasource": "$datasource", "datasource": "$datasource",
"description": "How many percent of read requests (LIST,GET) per second are returned with errors (5xx)?", "description": "How many percent of read requests (LIST,GET) per second are returned with errors (5xx)?",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -537,6 +540,7 @@ data:
"datasource": "$datasource", "datasource": "$datasource",
"description": "How many seconds is the 99th percentile for reading (LIST|GET) a given resource?", "description": "How many seconds is the 99th percentile for reading (LIST|GET) a given resource?",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -729,6 +733,7 @@ data:
"datasource": "$datasource", "datasource": "$datasource",
"description": "How many write requests (POST|PUT|PATCH|DELETE) per second do the apiservers get by code?", "description": "How many write requests (POST|PUT|PATCH|DELETE) per second do the apiservers get by code?",
"fill": 10, "fill": 10,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -837,6 +842,7 @@ data:
"datasource": "$datasource", "datasource": "$datasource",
"description": "How many percent of write requests (POST|PUT|PATCH|DELETE) per second are returned with errors (5xx)?", "description": "How many percent of write requests (POST|PUT|PATCH|DELETE) per second are returned with errors (5xx)?",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -930,6 +936,7 @@ data:
"datasource": "$datasource", "datasource": "$datasource",
"description": "How many seconds is the 99th percentile for writing (POST|PUT|PATCH|DELETE) a given resource?", "description": "How many seconds is the 99th percentile for writing (POST|PUT|PATCH|DELETE) a given resource?",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1035,6 +1042,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1127,6 +1135,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1219,6 +1228,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1324,6 +1334,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1416,6 +1427,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1508,6 +1520,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1832,6 +1845,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1937,6 +1951,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -2042,6 +2057,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -2147,6 +2163,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -2260,6 +2277,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -2365,6 +2383,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -2470,6 +2489,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -2562,6 +2582,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -2654,6 +2675,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -2868,6 +2890,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@@ -3019,7 +3042,7 @@ data:
"tableColumn": "",
"targets": [
{
- "expr": "(\n kubelet_volume_stats_capacity_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n -\n kubelet_volume_stats_available_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n)\n/\nkubelet_volume_stats_capacity_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n* 100\n",
+ "expr": "max without(instance,node) (\n(\n kubelet_volume_stats_capacity_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n -\n kubelet_volume_stats_available_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n)\n/\nkubelet_volume_stats_capacity_bytes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n* 100)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
@@ -3064,6 +3087,7 @@ data:
"dashes": false,
"datasource": "$datasource",
"fill": 1,
+ "fillGradient": 0,
"gridPos": {
},
@@ -3215,7 +3239,7 @@ data:
"tableColumn": "",
"targets": [
{
- "expr": "kubelet_volume_stats_inodes_used{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n/\nkubelet_volume_stats_inodes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n* 100\n",
+ "expr": "max without(instance,node) (\nkubelet_volume_stats_inodes_used{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n/\nkubelet_volume_stats_inodes{cluster=\"$cluster\", job=\"kubelet\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"}\n* 100)\n",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "",
@ -3505,6 +3529,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -3618,6 +3643,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -3744,6 +3770,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -3857,6 +3884,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -3962,6 +3990,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -4067,6 +4096,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -4159,6 +4189,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -4251,6 +4282,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -4516,7 +4548,7 @@ data:
"tableColumn": "", "tableColumn": "",
"targets": [ "targets": [
{ {
"expr": "sum(rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}[3m]))", "expr": "sum(rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", container!=\"\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}[3m]))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "", "legendFormat": "",
@ -4599,7 +4631,7 @@ data:
"tableColumn": "", "tableColumn": "",
"targets": [ "targets": [
{ {
"expr": "sum(container_memory_usage_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}) / 1024^3", "expr": "sum(container_memory_usage_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", container!=\"\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}) / 1024^3",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "", "legendFormat": "",
@ -4682,7 +4714,7 @@ data:
"tableColumn": "", "tableColumn": "",
"targets": [ "targets": [
{ {
"expr": "sum(rate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}[3m])) + sum(rate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=\"$namespace\",pod=~\"$statefulset.*\"}[3m]))", "expr": "sum(rate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}[3m])) + sum(rate(container_network_receive_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\",pod=~\"$statefulset.*\"}[3m]))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "", "legendFormat": "",
@ -5077,6 +5109,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
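
The statefulset panels above now scope the cAdvisor sums with container!="", so the aggregate pod-level cgroup series is not added on top of its per-container series, and the persistent-volume panels wrap the usage ratio in max without(instance, node), presumably so duplicate reports of the same claim (for example from different kubelet instances) collapse to a single value. A minimal PromQL sketch of both patterns, with the dashboard's $cluster/$namespace/$statefulset template selectors omitted for brevity:

# Drop the aggregate pod-level cgroup series before summing container CPU.
sum(rate(container_cpu_usage_seconds_total{job="kubernetes-cadvisor", container!=""}[3m]))

# Collapse duplicate per-node reports of the same PVC into a single series.
max without(instance, node) (
  (
    kubelet_volume_stats_capacity_bytes{job="kubelet"}
    - kubelet_volume_stats_available_bytes{job="kubelet"}
  )
  / kubelet_volume_stats_capacity_bytes{job="kubelet"}
  * 100
)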

View File

@ -172,7 +172,7 @@ data:
"tableColumn": "", "tableColumn": "",
"targets": [ "targets": [
{ {
"expr": "sum(avg_over_time(nginx_ingress_controller_nginx_process_connections{cluster=~\"$cluster\", controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\"}[2m]))", "expr": "sum(avg_over_time(nginx_ingress_controller_nginx_process_connections{cluster=~\"$cluster\", controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",state=\"active\"}[2m]))",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "", "legendFormat": "",
@ -296,6 +296,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -388,6 +389,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -493,6 +495,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -612,6 +615,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -711,6 +715,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -803,6 +808,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
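
In the ingress dashboard above, the connections panel now matches state="active" on nginx_ingress_controller_nginx_process_connections; the controller exports that metric per connection state, so without the matcher the sum presumably mixes active connections with the other state series. A minimal sketch, dropping the controller/class/namespace template selectors:

# Count only connections currently in the "active" state.
sum(avg_over_time(
  nginx_ingress_controller_nginx_process_connections{state="active"}[2m]
))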

View File

@ -36,6 +36,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -129,6 +130,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 0, "fill": 0,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -255,6 +257,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -420,7 +423,7 @@ data:
"tableColumn": "", "tableColumn": "",
"targets": [ "targets": [
{ {
"expr": "100 -\n(\n node_memory_MemAvailable_bytes{job=\"node-exporter\", instance=\"$instance\"}\n/\n node_memory_MemTotal_bytes{job=\"node-exporter\", instance=\"$instance\"}\n* 100\n)\n", "expr": "100 -\n(\n avg(node_memory_MemAvailable_bytes{job=\"node-exporter\", instance=\"$instance\"})\n/\n avg(node_memory_MemTotal_bytes{job=\"node-exporter\", instance=\"$instance\"})\n* 100\n)\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "", "legendFormat": "",
@ -462,6 +465,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 0, "fill": 0,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -578,6 +582,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -697,6 +702,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 0, "fill": 0,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -790,6 +796,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 0, "fill": 0,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
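
The memory gauge above now wraps both node_memory series in avg(), which keeps the panel to a single value even if the $instance selector momentarily matches duplicate series (for example around a node-exporter redeploy). A hedged sketch, with a hypothetical instance value standing in for the $instance variable:

# "node1:9100" is a placeholder; the dashboard substitutes $instance.
100 - (
  avg(node_memory_MemAvailable_bytes{job="node-exporter", instance="node1:9100"})
  /
  avg(node_memory_MemTotal_bytes{job="node-exporter", instance="node1:9100"})
  * 100
)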

View File

@ -21,7 +21,7 @@ data:
"links": [ "links": [
], ],
"refresh": "", "refresh": "60s",
"rows": [ "rows": [
{ {
"collapse": false, "collapse": false,
@ -36,6 +36,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -72,7 +73,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "(\n prometheus_remote_storage_highest_timestamp_in_seconds{cluster=~\"$cluster\", instance=~\"$instance\"} \n- \n ignoring(remote_name, url) group_right(instance) prometheus_remote_storage_queue_highest_sent_timestamp_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}\n)\n", "expr": "(\n prometheus_remote_storage_highest_timestamp_in_seconds{cluster=~\"$cluster\", instance=~\"$instance\"} \n- \n ignoring(remote_name, url) group_right(instance) (prometheus_remote_storage_queue_highest_sent_timestamp_seconds{cluster=~\"$cluster\", instance=~\"$instance\"} != 0)\n)\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}", "legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
@ -128,6 +129,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -164,7 +166,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "(\n rate(prometheus_remote_storage_highest_timestamp_in_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) \n- \n ignoring (remote_name, url) group_right(instance) rate(prometheus_remote_storage_queue_highest_sent_timestamp_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n)\n", "expr": "clamp_min(\n rate(prometheus_remote_storage_highest_timestamp_in_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) \n- \n ignoring (remote_name, url) group_right(instance) rate(prometheus_remote_storage_queue_highest_sent_timestamp_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n, 0)\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}", "legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
@ -233,6 +235,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -269,7 +272,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "rate(\n prometheus_remote_storage_samples_in_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n- \n ignoring(remote_name, url) group_right(instance) rate(prometheus_remote_storage_succeeded_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n- \n rate(prometheus_remote_storage_dropped_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n", "expr": "rate(\n prometheus_remote_storage_samples_in_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n- \n ignoring(remote_name, url) group_right(instance) (rate(prometheus_remote_storage_succeeded_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) or rate(prometheus_remote_storage_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]))\n- \n (rate(prometheus_remote_storage_dropped_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) or rate(prometheus_remote_storage_samples_dropped_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]))\n",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}", "legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
@ -338,6 +341,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -431,6 +435,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -523,6 +528,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -615,6 +621,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -720,6 +727,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -812,6 +820,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -848,7 +857,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "prometheus_remote_storage_pending_samples{cluster=~\"$cluster\", instance=~\"$instance\"}", "expr": "prometheus_remote_storage_pending_samples{cluster=~\"$cluster\", instance=~\"$instance\"} or prometheus_remote_storage_samples_pending{cluster=~\"$cluster\", instance=~\"$instance\"}",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}", "legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
@ -917,6 +926,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1009,6 +1019,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1114,6 +1125,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1150,7 +1162,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "rate(prometheus_remote_storage_dropped_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])", "expr": "rate(prometheus_remote_storage_dropped_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) or rate(prometheus_remote_storage_samples_dropped_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}", "legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
@ -1206,6 +1218,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1242,7 +1255,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "rate(prometheus_remote_storage_failed_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])", "expr": "rate(prometheus_remote_storage_failed_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) or rate(prometheus_remote_storage_samples_failed_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}", "legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
@ -1298,6 +1311,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1334,7 +1348,7 @@ data:
"steppedLine": false, "steppedLine": false,
"targets": [ "targets": [
{ {
"expr": "rate(prometheus_remote_storage_retried_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])", "expr": "rate(prometheus_remote_storage_retried_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) or rate(prometheus_remote_storage_samples_retried_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])",
"format": "time_series", "format": "time_series",
"intervalFactor": 2, "intervalFactor": 2,
"legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}", "legendFormat": "{{cluster}}:{{instance}} {{remote_name}}:{{url}}",
@ -1390,6 +1404,7 @@ data:
"dashes": false, "dashes": false,
"datasource": "$datasource", "datasource": "$datasource",
"fill": 1, "fill": 1,
"fillGradient": 0,
"gridPos": { "gridPos": {
}, },
@ -1486,7 +1501,7 @@ data:
"schemaVersion": 14, "schemaVersion": 14,
"style": "dark", "style": "dark",
"tags": [ "tags": [
"prometheus-mixin"
], ],
"templating": { "templating": {
"list": [ "list": [
@ -1630,7 +1645,7 @@ data:
] ]
}, },
"timezone": "browser", "timezone": "browser",
"title": "Prometheus Remote Write", "title": "Prometheus / Remote Write",
"version": 0 "version": 0
} }
prometheus.json: |- prometheus.json: |-
@ -1647,7 +1662,7 @@ data:
"links": [ "links": [
], ],
"refresh": "10s", "refresh": "60s",
"rows": [ "rows": [
{ {
"collapse": false, "collapse": false,
@ -2726,7 +2741,7 @@ data:
"schemaVersion": 14, "schemaVersion": 14,
"style": "dark", "style": "dark",
"tags": [ "tags": [
"prometheus-mixin"
], ],
"templating": { "templating": {
"list": [ "list": [
@ -2834,7 +2849,7 @@ data:
] ]
}, },
"timezone": "utc", "timezone": "utc",
"title": "Prometheus Overview", "title": "Prometheus / Overview",
"uid": "", "uid": "",
"version": 0 "version": 0
} }
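
Several of the remote-write panels above join the pre-rename and post-rename queue metric names with or (for example prometheus_remote_storage_pending_samples alongside prometheus_remote_storage_samples_pending), so the same dashboard renders against Prometheus releases that expose only one of the two names; the rate-difference panel is additionally wrapped in clamp_min(..., 0) so counter resets cannot drag it below zero. A minimal sketch of the fallback pattern:

# Use whichever series name the scraped Prometheus exposes, old or new.
rate(prometheus_remote_storage_failed_samples_total[5m])
  or
rate(prometheus_remote_storage_samples_failed_total[5m])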

View File

@ -24,7 +24,7 @@ spec:
type: RuntimeDefault type: RuntimeDefault
containers: containers:
- name: grafana - name: grafana
image: docker.io/grafana/grafana:7.1.5 image: docker.io/grafana/grafana:7.5.6
env: env:
- name: GF_PATHS_CONFIG - name: GF_PATHS_CONFIG
value: "/etc/grafana/custom.ini" value: "/etc/grafana/custom.ini"

View File

@ -23,7 +23,7 @@ spec:
type: RuntimeDefault type: RuntimeDefault
containers: containers:
- name: nginx-ingress-controller - name: nginx-ingress-controller
image: k8s.gcr.io/ingress-nginx/controller:v0.35.0 image: k8s.gcr.io/ingress-nginx/controller:v0.46.0
args: args:
- /nginx-ingress-controller - /nginx-ingress-controller
- --ingress-class=public - --ingress-class=public

View File

@ -23,7 +23,7 @@ spec:
type: RuntimeDefault type: RuntimeDefault
containers: containers:
- name: nginx-ingress-controller - name: nginx-ingress-controller
image: k8s.gcr.io/ingress-nginx/controller:v0.35.0 image: k8s.gcr.io/ingress-nginx/controller:v0.46.0
args: args:
- /nginx-ingress-controller - /nginx-ingress-controller
- --ingress-class=public - --ingress-class=public

View File

@ -23,7 +23,7 @@ spec:
type: RuntimeDefault type: RuntimeDefault
containers: containers:
- name: nginx-ingress-controller - name: nginx-ingress-controller
image: k8s.gcr.io/ingress-nginx/controller:v0.35.0 image: k8s.gcr.io/ingress-nginx/controller:v0.46.0
args: args:
- /nginx-ingress-controller - /nginx-ingress-controller
- --ingress-class=public - --ingress-class=public

View File

@ -23,7 +23,7 @@ spec:
type: RuntimeDefault type: RuntimeDefault
containers: containers:
- name: nginx-ingress-controller - name: nginx-ingress-controller
image: k8s.gcr.io/ingress-nginx/controller:v0.35.0 image: k8s.gcr.io/ingress-nginx/controller:v0.46.0
args: args:
- /nginx-ingress-controller - /nginx-ingress-controller
- --ingress-class=public - --ingress-class=public

View File

@ -23,7 +23,7 @@ spec:
type: RuntimeDefault type: RuntimeDefault
containers: containers:
- name: nginx-ingress-controller - name: nginx-ingress-controller
image: k8s.gcr.io/ingress-nginx/controller:v0.35.0 image: k8s.gcr.io/ingress-nginx/controller:v0.46.0
args: args:
- /nginx-ingress-controller - /nginx-ingress-controller
- --ingress-class=public - --ingress-class=public

View File

@ -21,7 +21,7 @@ spec:
serviceAccountName: prometheus serviceAccountName: prometheus
containers: containers:
- name: prometheus - name: prometheus
image: quay.io/prometheus/prometheus:v2.21.0 image: quay.io/prometheus/prometheus:v2.27.0
args: args:
- --web.listen-address=0.0.0.0:9090 - --web.listen-address=0.0.0.0:9090
- --config.file=/etc/prometheus/prometheus.yaml - --config.file=/etc/prometheus/prometheus.yaml

View File

@ -78,13 +78,6 @@ rules:
verbs: verbs:
- list - list
- watch - watch
- apiGroups:
- autoscaling.k8s.io
resources:
- verticalpodautoscalers
verbs:
- list
- watch
- apiGroups: - apiGroups:
- admissionregistration.k8s.io - admissionregistration.k8s.io
resources: resources:
@ -97,6 +90,14 @@ rules:
- networking.k8s.io - networking.k8s.io
resources: resources:
- networkpolicies - networkpolicies
- ingresses
verbs:
- list
- watch
- apiGroups:
- coordination.k8s.io
resources:
- leases
verbs: verbs:
- list - list
- watch - watch

View File

@ -25,10 +25,12 @@ spec:
serviceAccountName: kube-state-metrics serviceAccountName: kube-state-metrics
containers: containers:
- name: kube-state-metrics - name: kube-state-metrics
image: quay.io/coreos/kube-state-metrics:v1.9.7 image: k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.0.0
ports: ports:
- name: metrics - name: metrics
containerPort: 8080 containerPort: 8080
- name: telemetry
containerPort: 8081
livenessProbe: livenessProbe:
httpGet: httpGet:
path: /healthz path: /healthz
@ -41,3 +43,5 @@ spec:
port: 8081 port: 8081
initialDelaySeconds: 5 initialDelaySeconds: 5
timeoutSeconds: 5 timeoutSeconds: 5
securityContext:
runAsUser: 65534

View File

@ -28,7 +28,7 @@ spec:
hostPID: true hostPID: true
containers: containers:
- name: node-exporter - name: node-exporter
image: quay.io/prometheus/node-exporter:v1.0.1 image: quay.io/prometheus/node-exporter:v1.1.2
args: args:
- --path.procfs=/host/proc - --path.procfs=/host/proc
- --path.sysfs=/host/sys - --path.sysfs=/host/sys

View File

@ -9,7 +9,8 @@ data:
{ {
"alert": "etcdMembersDown", "alert": "etcdMembersDown",
"annotations": { "annotations": {
"message": "etcd cluster \"{{ $labels.job }}\": members are down ({{ $value }})." "description": "etcd cluster \"{{ $labels.job }}\": members are down ({{ $value }}).",
"summary": "etcd cluster members are down."
}, },
"expr": "max without (endpoint) (\n sum without (instance) (up{job=~\".*etcd.*\"} == bool 0)\nor\n count without (To) (\n sum without (instance) (rate(etcd_network_peer_sent_failures_total{job=~\".*etcd.*\"}[120s])) > 0.01\n )\n)\n> 0\n", "expr": "max without (endpoint) (\n sum without (instance) (up{job=~\".*etcd.*\"} == bool 0)\nor\n count without (To) (\n sum without (instance) (rate(etcd_network_peer_sent_failures_total{job=~\".*etcd.*\"}[120s])) > 0.01\n )\n)\n> 0\n",
"for": "10m", "for": "10m",
@ -20,7 +21,8 @@ data:
{ {
"alert": "etcdInsufficientMembers", "alert": "etcdInsufficientMembers",
"annotations": { "annotations": {
"message": "etcd cluster \"{{ $labels.job }}\": insufficient members ({{ $value }})." "description": "etcd cluster \"{{ $labels.job }}\": insufficient members ({{ $value }}).",
"summary": "etcd cluster has insufficient number of members."
}, },
"expr": "sum(up{job=~\".*etcd.*\"} == bool 1) without (instance) < ((count(up{job=~\".*etcd.*\"}) without (instance) + 1) / 2)\n", "expr": "sum(up{job=~\".*etcd.*\"} == bool 1) without (instance) < ((count(up{job=~\".*etcd.*\"}) without (instance) + 1) / 2)\n",
"for": "3m", "for": "3m",
@ -31,7 +33,8 @@ data:
{ {
"alert": "etcdNoLeader", "alert": "etcdNoLeader",
"annotations": { "annotations": {
"message": "etcd cluster \"{{ $labels.job }}\": member {{ $labels.instance }} has no leader." "description": "etcd cluster \"{{ $labels.job }}\": member {{ $labels.instance }} has no leader.",
"summary": "etcd cluster has no leader."
}, },
"expr": "etcd_server_has_leader{job=~\".*etcd.*\"} == 0\n", "expr": "etcd_server_has_leader{job=~\".*etcd.*\"} == 0\n",
"for": "1m", "for": "1m",
@ -42,7 +45,8 @@ data:
{ {
"alert": "etcdHighNumberOfLeaderChanges", "alert": "etcdHighNumberOfLeaderChanges",
"annotations": { "annotations": {
"message": "etcd cluster \"{{ $labels.job }}\": {{ $value }} leader changes within the last 15 minutes. Frequent elections may be a sign of insufficient resources, high network latency, or disruptions by other components and should be investigated." "description": "etcd cluster \"{{ $labels.job }}\": {{ $value }} leader changes within the last 15 minutes. Frequent elections may be a sign of insufficient resources, high network latency, or disruptions by other components and should be investigated.",
"summary": "etcd cluster has high number of leader changes."
}, },
"expr": "increase((max without (instance) (etcd_server_leader_changes_seen_total{job=~\".*etcd.*\"}) or 0*absent(etcd_server_leader_changes_seen_total{job=~\".*etcd.*\"}))[15m:1m]) >= 4\n", "expr": "increase((max without (instance) (etcd_server_leader_changes_seen_total{job=~\".*etcd.*\"}) or 0*absent(etcd_server_leader_changes_seen_total{job=~\".*etcd.*\"}))[15m:1m]) >= 4\n",
"for": "5m", "for": "5m",
@ -50,32 +54,11 @@ data:
"severity": "warning" "severity": "warning"
} }
}, },
{
"alert": "etcdHighNumberOfFailedGRPCRequests",
"annotations": {
"message": "etcd cluster \"{{ $labels.job }}\": {{ $value }}% of requests for {{ $labels.grpc_method }} failed on etcd instance {{ $labels.instance }}."
},
"expr": "100 * sum(rate(grpc_server_handled_total{job=~\".*etcd.*\", grpc_code!=\"OK\"}[5m])) without (grpc_type, grpc_code)\n /\nsum(rate(grpc_server_handled_total{job=~\".*etcd.*\"}[5m])) without (grpc_type, grpc_code)\n > 1\n",
"for": "10m",
"labels": {
"severity": "warning"
}
},
{
"alert": "etcdHighNumberOfFailedGRPCRequests",
"annotations": {
"message": "etcd cluster \"{{ $labels.job }}\": {{ $value }}% of requests for {{ $labels.grpc_method }} failed on etcd instance {{ $labels.instance }}."
},
"expr": "100 * sum(rate(grpc_server_handled_total{job=~\".*etcd.*\", grpc_code!=\"OK\"}[5m])) without (grpc_type, grpc_code)\n /\nsum(rate(grpc_server_handled_total{job=~\".*etcd.*\"}[5m])) without (grpc_type, grpc_code)\n > 5\n",
"for": "5m",
"labels": {
"severity": "critical"
}
},
{ {
"alert": "etcdGRPCRequestsSlow", "alert": "etcdGRPCRequestsSlow",
"annotations": { "annotations": {
"message": "etcd cluster \"{{ $labels.job }}\": gRPC requests to {{ $labels.grpc_method }} are taking {{ $value }}s on etcd instance {{ $labels.instance }}." "description": "etcd cluster \"{{ $labels.job }}\": gRPC requests to {{ $labels.grpc_method }} are taking {{ $value }}s on etcd instance {{ $labels.instance }}.",
"summary": "etcd grpc requests are slow"
}, },
"expr": "histogram_quantile(0.99, sum(rate(grpc_server_handling_seconds_bucket{job=~\".*etcd.*\", grpc_type=\"unary\"}[5m])) without(grpc_type))\n> 0.15\n", "expr": "histogram_quantile(0.99, sum(rate(grpc_server_handling_seconds_bucket{job=~\".*etcd.*\", grpc_type=\"unary\"}[5m])) without(grpc_type))\n> 0.15\n",
"for": "10m", "for": "10m",
@ -86,7 +69,8 @@ data:
{ {
"alert": "etcdMemberCommunicationSlow", "alert": "etcdMemberCommunicationSlow",
"annotations": { "annotations": {
"message": "etcd cluster \"{{ $labels.job }}\": member communication with {{ $labels.To }} is taking {{ $value }}s on etcd instance {{ $labels.instance }}." "description": "etcd cluster \"{{ $labels.job }}\": member communication with {{ $labels.To }} is taking {{ $value }}s on etcd instance {{ $labels.instance }}.",
"summary": "etcd cluster member communication is slow."
}, },
"expr": "histogram_quantile(0.99, rate(etcd_network_peer_round_trip_time_seconds_bucket{job=~\".*etcd.*\"}[5m]))\n> 0.15\n", "expr": "histogram_quantile(0.99, rate(etcd_network_peer_round_trip_time_seconds_bucket{job=~\".*etcd.*\"}[5m]))\n> 0.15\n",
"for": "10m", "for": "10m",
@ -97,7 +81,8 @@ data:
{ {
"alert": "etcdHighNumberOfFailedProposals", "alert": "etcdHighNumberOfFailedProposals",
"annotations": { "annotations": {
"message": "etcd cluster \"{{ $labels.job }}\": {{ $value }} proposal failures within the last 30 minutes on etcd instance {{ $labels.instance }}." "description": "etcd cluster \"{{ $labels.job }}\": {{ $value }} proposal failures within the last 30 minutes on etcd instance {{ $labels.instance }}.",
"summary": "etcd cluster has high number of proposal failures."
}, },
"expr": "rate(etcd_server_proposals_failed_total{job=~\".*etcd.*\"}[15m]) > 5\n", "expr": "rate(etcd_server_proposals_failed_total{job=~\".*etcd.*\"}[15m]) > 5\n",
"for": "15m", "for": "15m",
@ -108,7 +93,8 @@ data:
{ {
"alert": "etcdHighFsyncDurations", "alert": "etcdHighFsyncDurations",
"annotations": { "annotations": {
"message": "etcd cluster \"{{ $labels.job }}\": 99th percentile fync durations are {{ $value }}s on etcd instance {{ $labels.instance }}." "description": "etcd cluster \"{{ $labels.job }}\": 99th percentile fsync durations are {{ $value }}s on etcd instance {{ $labels.instance }}.",
"summary": "etcd cluster 99th percentile fsync durations are too high."
}, },
"expr": "histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket{job=~\".*etcd.*\"}[5m]))\n> 0.5\n", "expr": "histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket{job=~\".*etcd.*\"}[5m]))\n> 0.5\n",
"for": "10m", "for": "10m",
@ -116,10 +102,22 @@ data:
"severity": "warning" "severity": "warning"
} }
}, },
{
"alert": "etcdHighFsyncDurations",
"annotations": {
"message": "etcd cluster \"{{ $labels.job }}\": 99th percentile fsync durations are {{ $value }}s on etcd instance {{ $labels.instance }}."
},
"expr": "histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket{job=~\".*etcd.*\"}[5m]))\n> 1\n",
"for": "10m",
"labels": {
"severity": "critical"
}
},
{ {
"alert": "etcdHighCommitDurations", "alert": "etcdHighCommitDurations",
"annotations": { "annotations": {
"message": "etcd cluster \"{{ $labels.job }}\": 99th percentile commit durations {{ $value }}s on etcd instance {{ $labels.instance }}." "description": "etcd cluster \"{{ $labels.job }}\": 99th percentile commit durations {{ $value }}s on etcd instance {{ $labels.instance }}.",
"summary": "etcd cluster 99th percentile commit durations are too high."
}, },
"expr": "histogram_quantile(0.99, rate(etcd_disk_backend_commit_duration_seconds_bucket{job=~\".*etcd.*\"}[5m]))\n> 0.25\n", "expr": "histogram_quantile(0.99, rate(etcd_disk_backend_commit_duration_seconds_bucket{job=~\".*etcd.*\"}[5m]))\n> 0.25\n",
"for": "10m", "for": "10m",
@ -130,7 +128,8 @@ data:
{ {
"alert": "etcdHighNumberOfFailedHTTPRequests", "alert": "etcdHighNumberOfFailedHTTPRequests",
"annotations": { "annotations": {
"message": "{{ $value }}% of requests for {{ $labels.method }} failed on etcd instance {{ $labels.instance }}" "description": "{{ $value }}% of requests for {{ $labels.method }} failed on etcd instance {{ $labels.instance }}",
"summary": "etcd has high number of failed HTTP requests."
}, },
"expr": "sum(rate(etcd_http_failed_total{job=~\".*etcd.*\", code!=\"404\"}[5m])) without (code) / sum(rate(etcd_http_received_total{job=~\".*etcd.*\"}[5m]))\nwithout (code) > 0.01\n", "expr": "sum(rate(etcd_http_failed_total{job=~\".*etcd.*\", code!=\"404\"}[5m])) without (code) / sum(rate(etcd_http_received_total{job=~\".*etcd.*\"}[5m]))\nwithout (code) > 0.01\n",
"for": "10m", "for": "10m",
@ -141,7 +140,8 @@ data:
{ {
"alert": "etcdHighNumberOfFailedHTTPRequests", "alert": "etcdHighNumberOfFailedHTTPRequests",
"annotations": { "annotations": {
"message": "{{ $value }}% of requests for {{ $labels.method }} failed on etcd instance {{ $labels.instance }}." "description": "{{ $value }}% of requests for {{ $labels.method }} failed on etcd instance {{ $labels.instance }}.",
"summary": "etcd has high number of failed HTTP requests."
}, },
"expr": "sum(rate(etcd_http_failed_total{job=~\".*etcd.*\", code!=\"404\"}[5m])) without (code) / sum(rate(etcd_http_received_total{job=~\".*etcd.*\"}[5m]))\nwithout (code) > 0.05\n", "expr": "sum(rate(etcd_http_failed_total{job=~\".*etcd.*\", code!=\"404\"}[5m])) without (code) / sum(rate(etcd_http_received_total{job=~\".*etcd.*\"}[5m]))\nwithout (code) > 0.05\n",
"for": "10m", "for": "10m",
@ -152,13 +152,36 @@ data:
{ {
"alert": "etcdHTTPRequestsSlow", "alert": "etcdHTTPRequestsSlow",
"annotations": { "annotations": {
"message": "etcd instance {{ $labels.instance }} HTTP requests to {{ $labels.method }} are slow." "description": "etcd instance {{ $labels.instance }} HTTP requests to {{ $labels.method }} are slow.",
"summary": "etcd instance HTTP requests are slow."
}, },
"expr": "histogram_quantile(0.99, rate(etcd_http_successful_duration_seconds_bucket[5m]))\n> 0.15\n", "expr": "histogram_quantile(0.99, rate(etcd_http_successful_duration_seconds_bucket[5m]))\n> 0.15\n",
"for": "10m", "for": "10m",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
} }
},
{
"alert": "etcdBackendQuotaLowSpace",
"annotations": {
"message": "etcd cluster \"{{ $labels.job }}\": database size exceeds the defined quota on etcd instance {{ $labels.instance }}, please defrag or increase the quota as the writes to etcd will be disabled when it is full."
},
"expr": "(etcd_mvcc_db_total_size_in_bytes/etcd_server_quota_backend_bytes)*100 > 95\n",
"for": "10m",
"labels": {
"severity": "critical"
}
},
{
"alert": "etcdExcessiveDatabaseGrowth",
"annotations": {
"message": "etcd cluster \"{{ $labels.job }}\": Observed surge in etcd writes leading to 50% increase in database size over the past four hours on etcd instance {{ $labels.instance }}, please check as it might be disruptive."
},
"expr": "increase(((etcd_mvcc_db_total_size_in_bytes/etcd_server_quota_backend_bytes)*100)[240m:1m]) > 50\n",
"for": "10m",
"labels": {
"severity": "warning"
}
} }
] ]
} }
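
The new etcdBackendQuotaLowSpace alert above fires when the database size crosses 95% of the configured backend quota: assuming the default quota of roughly 2 GB, a database nearing 1.9 GB puts the ratio at about 95%, and the alert triggers once that holds for 10 minutes. The underlying ratio, usable as an ad-hoc query:

# Percentage of the etcd backend quota currently in use.
(etcd_mvcc_db_total_size_in_bytes / etcd_server_quota_backend_bytes) * 100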
@ -298,10 +321,6 @@ data:
}, },
"record": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile" "record": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile"
}, },
{
"expr": "sum(rate(apiserver_request_duration_seconds_sum{subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT\"}[5m])) without(instance, pod)\n/\nsum(rate(apiserver_request_duration_seconds_count{subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT\"}[5m])) without(instance, pod)\n",
"record": "cluster:apiserver_request_duration_seconds:mean5m"
},
{ {
"expr": "histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT\"}[5m])) without(instance, pod))\n", "expr": "histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT\"}[5m])) without(instance, pod))\n",
"labels": { "labels": {
@ -465,10 +484,6 @@ data:
{ {
"name": "k8s.rules", "name": "k8s.rules",
"rules": [ "rules": [
{
"expr": "sum(rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", image!=\"\", container!=\"POD\"}[5m])) by (namespace)\n",
"record": "namespace:container_cpu_usage_seconds_total:sum_rate"
},
{ {
"expr": "sum by (cluster, namespace, pod, container) (\n rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", image!=\"\", container!=\"POD\"}[5m])\n) * on (cluster, namespace, pod) group_left(node) topk by (cluster, namespace, pod) (\n 1, max by(cluster, namespace, pod, node) (kube_pod_info{node!=\"\"})\n)\n", "expr": "sum by (cluster, namespace, pod, container) (\n rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", image!=\"\", container!=\"POD\"}[5m])\n) * on (cluster, namespace, pod) group_left(node) topk by (cluster, namespace, pod) (\n 1, max by(cluster, namespace, pod, node) (kube_pod_info{node!=\"\"})\n)\n",
"record": "node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate" "record": "node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate"
@ -489,10 +504,6 @@ data:
"expr": "container_memory_swap{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) topk by(namespace, pod) (1,\n max by(namespace, pod, node) (kube_pod_info{node!=\"\"})\n)\n", "expr": "container_memory_swap{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) topk by(namespace, pod) (1,\n max by(namespace, pod, node) (kube_pod_info{node!=\"\"})\n)\n",
"record": "node_namespace_pod_container:container_memory_swap" "record": "node_namespace_pod_container:container_memory_swap"
}, },
{
"expr": "sum(container_memory_usage_bytes{job=\"kubernetes-cadvisor\", image!=\"\", container!=\"POD\"}) by (namespace)\n",
"record": "namespace:container_memory_usage_bytes:sum"
},
{ {
"expr": "sum by (namespace) (\n sum by (namespace, pod) (\n max by (namespace, pod, container) (\n kube_pod_container_resource_requests_memory_bytes{job=\"kube-state-metrics\"}\n ) * on(namespace, pod) group_left() max by (namespace, pod) (\n kube_pod_status_phase{phase=~\"Pending|Running\"} == 1\n )\n )\n)\n", "expr": "sum by (namespace) (\n sum by (namespace, pod) (\n max by (namespace, pod, container) (\n kube_pod_container_resource_requests_memory_bytes{job=\"kube-state-metrics\"}\n ) * on(namespace, pod) group_left() max by (namespace, pod) (\n kube_pod_status_phase{phase=~\"Pending|Running\"} == 1\n )\n )\n)\n",
"record": "namespace:kube_pod_container_resource_requests_memory_bytes:sum" "record": "namespace:kube_pod_container_resource_requests_memory_bytes:sum"
@ -595,10 +606,6 @@ data:
{ {
"name": "node.rules", "name": "node.rules",
"rules": [ "rules": [
{
"expr": "sum(min(kube_pod_info{node!=\"\"}) by (cluster, node))\n",
"record": ":kube_pod_info_node_count:"
},
{ {
"expr": "topk by(namespace, pod) (1,\n max by (node, namespace, pod) (\n label_replace(kube_pod_info{job=\"kube-state-metrics\",node!=\"\"}, \"pod\", \"$1\", \"pod\", \"(.*)\")\n))\n", "expr": "topk by(namespace, pod) (1,\n max by (node, namespace, pod) (\n label_replace(kube_pod_info{job=\"kube-state-metrics\",node!=\"\"}, \"pod\", \"$1\", \"pod\", \"(.*)\")\n))\n",
"record": "node_namespace_pod:kube_pod_info:" "record": "node_namespace_pod:kube_pod_info:"
@ -801,7 +808,7 @@ data:
{ {
"alert": "KubeJobFailed", "alert": "KubeJobFailed",
"annotations": { "annotations": {
"description": "Job {{ $labels.namespace }}/{{ $labels.job_name }} failed to complete.", "description": "Job {{ $labels.namespace }}/{{ $labels.job_name }} failed to complete. Removing failed job after investigation should clear this alert.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubejobfailed", "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubejobfailed",
"summary": "Job failed to complete." "summary": "Job failed to complete."
}, },
@ -818,7 +825,7 @@ data:
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubehpareplicasmismatch", "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubehpareplicasmismatch",
"summary": "HPA has not matched descired number of replicas." "summary": "HPA has not matched descired number of replicas."
}, },
"expr": "(kube_hpa_status_desired_replicas{job=\"kube-state-metrics\"}\n !=\nkube_hpa_status_current_replicas{job=\"kube-state-metrics\"})\n and\nchanges(kube_hpa_status_current_replicas[15m]) == 0\n", "expr": "(kube_hpa_status_desired_replicas{job=\"kube-state-metrics\"}\n !=\nkube_hpa_status_current_replicas{job=\"kube-state-metrics\"})\n and\n(kube_hpa_status_current_replicas{job=\"kube-state-metrics\"}\n >\nkube_hpa_spec_min_replicas{job=\"kube-state-metrics\"})\n and\n(kube_hpa_status_current_replicas{job=\"kube-state-metrics\"}\n <\nkube_hpa_spec_max_replicas{job=\"kube-state-metrics\"})\n and\nchanges(kube_hpa_status_current_replicas[15m]) == 0\n",
"for": "15m", "for": "15m",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
@ -888,7 +895,7 @@ data:
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubememoryquotaovercommit", "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubememoryquotaovercommit",
"summary": "Cluster has overcommitted memory resource requests." "summary": "Cluster has overcommitted memory resource requests."
}, },
"expr": "sum(kube_resourcequota{job=\"kube-state-metrics\", type=\"hard\", resource=\"memory\"})\n /\nsum(kube_node_status_allocatable_memory_bytes{job=\"node-exporter\"})\n > 1.5\n", "expr": "sum(kube_resourcequota{job=\"kube-state-metrics\", type=\"hard\", resource=\"memory\"})\n /\nsum(kube_node_status_allocatable_memory_bytes{job=\"kube-state-metrics\"})\n > 1.5\n",
"for": "5m", "for": "5m",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
@ -1118,11 +1125,11 @@ data:
{ {
"alert": "AggregatedAPIErrors", "alert": "AggregatedAPIErrors",
"annotations": { "annotations": {
"description": "An aggregated API {{ $labels.name }}/{{ $labels.namespace }} has reported errors. The number of errors have increased for it in the past five minutes. High values indicate that the availability of the service changes too often.", "description": "An aggregated API {{ $labels.name }}/{{ $labels.namespace }} has reported errors. It has appeared unavailable {{ $value | humanize }} times averaged over the past 10m.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-aggregatedapierrors", "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-aggregatedapierrors",
"summary": "An aggregated API has reported errors." "summary": "An aggregated API has reported errors."
}, },
"expr": "sum by(name, namespace)(increase(aggregator_unavailable_apiservice_count[5m])) > 2\n", "expr": "sum by(name, namespace)(increase(aggregator_unavailable_apiservice_count[10m])) > 4\n",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
} }
@ -1363,115 +1370,6 @@ data:
} }
] ]
} }
loki.yaml: |-
{
"groups": [
{
"name": "loki_rules",
"rules": [
{
"expr": "histogram_quantile(0.99, sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, job))",
"record": "job:loki_request_duration_seconds:99quantile"
},
{
"expr": "histogram_quantile(0.50, sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, job))",
"record": "job:loki_request_duration_seconds:50quantile"
},
{
"expr": "sum(rate(loki_request_duration_seconds_sum[1m])) by (job) / sum(rate(loki_request_duration_seconds_count[1m])) by (job)",
"record": "job:loki_request_duration_seconds:avg"
},
{
"expr": "sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, job)",
"record": "job:loki_request_duration_seconds_bucket:sum_rate"
},
{
"expr": "sum(rate(loki_request_duration_seconds_sum[1m])) by (job)",
"record": "job:loki_request_duration_seconds_sum:sum_rate"
},
{
"expr": "sum(rate(loki_request_duration_seconds_count[1m])) by (job)",
"record": "job:loki_request_duration_seconds_count:sum_rate"
},
{
"expr": "histogram_quantile(0.99, sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, job, route))",
"record": "job_route:loki_request_duration_seconds:99quantile"
},
{
"expr": "histogram_quantile(0.50, sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, job, route))",
"record": "job_route:loki_request_duration_seconds:50quantile"
},
{
"expr": "sum(rate(loki_request_duration_seconds_sum[1m])) by (job, route) / sum(rate(loki_request_duration_seconds_count[1m])) by (job, route)",
"record": "job_route:loki_request_duration_seconds:avg"
},
{
"expr": "sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, job, route)",
"record": "job_route:loki_request_duration_seconds_bucket:sum_rate"
},
{
"expr": "sum(rate(loki_request_duration_seconds_sum[1m])) by (job, route)",
"record": "job_route:loki_request_duration_seconds_sum:sum_rate"
},
{
"expr": "sum(rate(loki_request_duration_seconds_count[1m])) by (job, route)",
"record": "job_route:loki_request_duration_seconds_count:sum_rate"
},
{
"expr": "histogram_quantile(0.99, sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, namespace, job, route))",
"record": "namespace_job_route:loki_request_duration_seconds:99quantile"
},
{
"expr": "histogram_quantile(0.50, sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, namespace, job, route))",
"record": "namespace_job_route:loki_request_duration_seconds:50quantile"
},
{
"expr": "sum(rate(loki_request_duration_seconds_sum[1m])) by (namespace, job, route) / sum(rate(loki_request_duration_seconds_count[1m])) by (namespace, job, route)",
"record": "namespace_job_route:loki_request_duration_seconds:avg"
},
{
"expr": "sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, namespace, job, route)",
"record": "namespace_job_route:loki_request_duration_seconds_bucket:sum_rate"
},
{
"expr": "sum(rate(loki_request_duration_seconds_sum[1m])) by (namespace, job, route)",
"record": "namespace_job_route:loki_request_duration_seconds_sum:sum_rate"
},
{
"expr": "sum(rate(loki_request_duration_seconds_count[1m])) by (namespace, job, route)",
"record": "namespace_job_route:loki_request_duration_seconds_count:sum_rate"
}
]
},
{
"name": "loki_alerts",
"rules": [
{
"alert": "LokiRequestErrors",
"annotations": {
"message": "{{ $labels.job }} {{ $labels.route }} is experiencing {{ printf \"%.2f\" $value }}% errors.\n"
},
"expr": "100 * sum(rate(loki_request_duration_seconds_count{status_code=~\"5..\"}[1m])) by (namespace, job, route)\n /\nsum(rate(loki_request_duration_seconds_count[1m])) by (namespace, job, route)\n > 10\n",
"for": "15m",
"labels": {
"severity": "critical"
}
},
{
"alert": "LokiRequestLatency",
"annotations": {
"message": "{{ $labels.job }} {{ $labels.route }} is experiencing {{ printf \"%.2f\" $value }}s 99th percentile latency.\n"
},
"expr": "namespace_job_route:loki_request_duration_seconds:99quantile{route!~\"(?i).*tail.*\"} > 1\n",
"for": "15m",
"labels": {
"severity": "critical"
}
}
]
}
]
}
node-exporter.yaml: |- node-exporter.yaml: |-
{ {
"groups": [ "groups": [
@ -1629,7 +1527,7 @@ data:
"description": "{{ $labels.instance }} interface {{ $labels.device }} has encountered {{ printf \"%.0f\" $value }} receive errors in the last two minutes.", "description": "{{ $labels.instance }} interface {{ $labels.device }} has encountered {{ printf \"%.0f\" $value }} receive errors in the last two minutes.",
"summary": "Network interface is reporting many receive errors." "summary": "Network interface is reporting many receive errors."
}, },
"expr": "increase(node_network_receive_errs_total[2m]) > 10\n", "expr": "rate(node_network_receive_errs_total[2m]) / rate(node_network_receive_packets_total[2m]) > 0.01\n",
"for": "1h", "for": "1h",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
@ -1641,7 +1539,7 @@ data:
"description": "{{ $labels.instance }} interface {{ $labels.device }} has encountered {{ printf \"%.0f\" $value }} transmit errors in the last two minutes.", "description": "{{ $labels.instance }} interface {{ $labels.device }} has encountered {{ printf \"%.0f\" $value }} transmit errors in the last two minutes.",
"summary": "Network interface is reporting many transmit errors." "summary": "Network interface is reporting many transmit errors."
}, },
"expr": "increase(node_network_transmit_errs_total[2m]) > 10\n", "expr": "rate(node_network_transmit_errs_total[2m]) / rate(node_network_transmit_packets_total[2m]) > 0.01\n",
"for": "1h", "for": "1h",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
@ -1687,7 +1585,7 @@ data:
"message": "Clock on {{ $labels.instance }} is not synchronising. Ensure NTP is configured on this host.", "message": "Clock on {{ $labels.instance }} is not synchronising. Ensure NTP is configured on this host.",
"summary": "Clock not synchronising." "summary": "Clock not synchronising."
}, },
"expr": "min_over_time(node_timex_sync_status[5m]) == 0\n", "expr": "min_over_time(node_timex_sync_status[5m]) == 0\nand\nnode_timex_maxerror_seconds >= 16\n",
"for": "10m", "for": "10m",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
@ -1762,18 +1660,6 @@ data:
"severity": "warning" "severity": "warning"
} }
}, },
{
"alert": "PrometheusErrorSendingAlertsToAnyAlertmanager",
"annotations": {
"description": "{{ printf \"%.1f\" $value }}% minimum errors while sending alerts from Prometheus {{$labels.instance}} to any Alertmanager.",
"summary": "Prometheus encounters more than 3% errors sending alerts to any Alertmanager."
},
"expr": "min without(alertmanager) (\n rate(prometheus_notifications_errors_total{job=\"prometheus\"}[5m])\n/\n rate(prometheus_notifications_sent_total{job=\"prometheus\"}[5m])\n)\n* 100\n> 3\n",
"for": "15m",
"labels": {
"severity": "critical"
}
},
{ {
"alert": "PrometheusNotConnectedToAlertmanagers", "alert": "PrometheusNotConnectedToAlertmanagers",
"annotations": { "annotations": {
@ -1816,7 +1702,7 @@ data:
"description": "Prometheus {{$labels.instance}} is not ingesting samples.", "description": "Prometheus {{$labels.instance}} is not ingesting samples.",
"summary": "Prometheus is not ingesting samples." "summary": "Prometheus is not ingesting samples."
}, },
"expr": "rate(prometheus_tsdb_head_samples_appended_total{job=\"prometheus\"}[5m]) <= 0\n", "expr": "(\n rate(prometheus_tsdb_head_samples_appended_total{job=\"prometheus\"}[5m]) <= 0\nand\n (\n sum without(scrape_job) (prometheus_target_metadata_cache_entries{job=\"prometheus\"}) > 0\n or\n sum without(rule_group) (prometheus_rule_group_rules{job=\"prometheus\"}) > 0\n )\n)\n",
"for": "10m", "for": "10m",
"labels": { "labels": {
"severity": "warning" "severity": "warning"
@ -1864,7 +1750,7 @@ data:
"description": "Prometheus {{$labels.instance}} remote write is {{ printf \"%.1f\" $value }}s behind for {{ $labels.remote_name}}:{{ $labels.url }}.", "description": "Prometheus {{$labels.instance}} remote write is {{ printf \"%.1f\" $value }}s behind for {{ $labels.remote_name}}:{{ $labels.url }}.",
"summary": "Prometheus remote write is behind." "summary": "Prometheus remote write is behind."
}, },
"expr": "# Without max_over_time, failed scrapes could create false negatives, see\n# https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details.\n(\n max_over_time(prometheus_remote_storage_highest_timestamp_in_seconds{job=\"prometheus\"}[5m])\n- on(job, instance) group_right\n max_over_time(prometheus_remote_storage_queue_highest_sent_timestamp_seconds{job=\"prometheus\"}[5m])\n)\n> 120\n", "expr": "# Without max_over_time, failed scrapes could create false negatives, see\n# https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details.\n(\n max_over_time(prometheus_remote_storage_highest_timestamp_in_seconds{job=\"prometheus\"}[5m])\n- ignoring(remote_name, url) group_right\n max_over_time(prometheus_remote_storage_queue_highest_sent_timestamp_seconds{job=\"prometheus\"}[5m])\n)\n> 120\n",
"for": "15m", "for": "15m",
"labels": { "labels": {
"severity": "critical" "severity": "critical"
@ -1917,6 +1803,18 @@ data:
"labels": { "labels": {
"severity": "warning" "severity": "warning"
} }
},
{
"alert": "PrometheusErrorSendingAlertsToAnyAlertmanager",
"annotations": {
"description": "{{ printf \"%.1f\" $value }}% minimum errors while sending alerts from Prometheus {{$labels.instance}} to any Alertmanager.",
"summary": "Prometheus encounters more than 3% errors sending alerts to any Alertmanager."
},
"expr": "min without (alertmanager) (\n rate(prometheus_notifications_errors_total{job=\"prometheus\",alertmanager!~``}[5m])\n/\n rate(prometheus_notifications_sent_total{job=\"prometheus\",alertmanager!~``}[5m])\n)\n* 100\n> 3\n",
"for": "15m",
"labels": {
"severity": "critical"
}
} }
] ]
} }
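
The node-exporter network alerts above switch from an absolute count (more than 10 errors in two minutes) to an error ratio (more than 1% of packets), so the threshold scales with traffic: a handful of errors on a near-idle link still fires, while the same handful among millions of packets on a busy link no longer does. A minimal sketch of the receive-side expression:

# Fire when more than 1% of received packets over the last 2 minutes were errors.
rate(node_network_receive_errs_total[2m])
  / rate(node_network_receive_packets_total[2m]) > 0.01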

View File

@ -1,50 +0,0 @@
locals {
# Pick a CoreOS Container Linux derivative
# coreos-stable -> Container Linux AMI
# flatcar-stable -> Flatcar Linux AMI
ami_id = local.flavor == "flatcar" ? data.aws_ami.flatcar.image_id : data.aws_ami.coreos.image_id
flavor = split("-", var.os_image)[0]
channel = split("-", var.os_image)[1]
}
data "aws_ami" "coreos" {
most_recent = true
owners = ["595879546273"]
filter {
name = "architecture"
values = ["x86_64"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
filter {
name = "name"
values = ["CoreOS-${local.flavor == "coreos" ? local.channel : "stable"}-*"]
}
}
data "aws_ami" "flatcar" {
most_recent = true
owners = ["075585003325"]
filter {
name = "architecture"
values = ["x86_64"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
filter {
name = "name"
values = ["Flatcar-${local.flavor == "flatcar" ? local.channel : "stable"}-*"]
}
}

View File

@ -1,205 +0,0 @@
---
systemd:
units:
- name: etcd-member.service
enabled: true
dropins:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.4.12"
Environment="ETCD_IMAGE_URL=docker://quay.io/coreos/etcd"
Environment="RKT_RUN_ARGS=--insecure-options=image"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
Environment="ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379"
Environment="ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380"
Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
Environment="ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}"
Environment="ETCD_STRICT_RECONFIG_CHECK=true"
Environment="ETCD_SSL_DIR=/etc/ssl/etcd"
Environment="ETCD_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/server-ca.crt"
Environment="ETCD_CERT_FILE=/etc/ssl/certs/etcd/server.crt"
Environment="ETCD_KEY_FILE=/etc/ssl/certs/etcd/server.key"
Environment="ETCD_CLIENT_CERT_AUTH=true"
Environment="ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/peer-ca.crt"
Environment="ETCD_PEER_CERT_FILE=/etc/ssl/certs/etcd/peer.crt"
Environment="ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer.key"
Environment="ETCD_PEER_CLIENT_CERT_AUTH=true"
- name: docker.service
enabled: true
- name: locksmithd.service
mask: true
- name: wait-for-dns.service
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
Wants=systemd-resolved.service
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
RequiredBy=etcd-member.service
- name: kubelet.service
enabled: true
contents: |
[Unit]
Description=Kubelet
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.19.2
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/bin/rkt run \
--uuid-file-save=/var/cache/kubelet-pod.uuid \
--stage1-from-dir=stage1-fly.aci \
--hosts-entry host \
--insecure-options=image \
--volume etc-kubernetes,kind=host,source=/etc/kubernetes,readOnly=true \
--mount volume=etc-kubernetes,target=/etc/kubernetes \
--volume etc-machine-id,kind=host,source=/etc/machine-id,readOnly=true \
--mount volume=etc-machine-id,target=/etc/machine-id \
--volume etc-os-release,kind=host,source=/usr/lib/os-release,readOnly=true \
--mount volume=etc-os-release,target=/etc/os-release \
--volume=etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true \
--mount volume=etc-resolv,target=/etc/resolv.conf \
--volume etc-ssl-certs,kind=host,source=/etc/ssl/certs,readOnly=true \
--mount volume=etc-ssl-certs,target=/etc/ssl/certs \
--volume lib-modules,kind=host,source=/lib/modules,readOnly=true \
--mount volume=lib-modules,target=/lib/modules \
--volume run,kind=host,source=/run \
--mount volume=run,target=/run \
--volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true \
--mount volume=usr-share-certs,target=/usr/share/ca-certificates \
--volume var-lib-calico,kind=host,source=/var/lib/calico,readOnly=true \
--mount volume=var-lib-calico,target=/var/lib/calico \
--volume var-lib-docker,kind=host,source=/var/lib/docker \
--mount volume=var-lib-docker,target=/var/lib/docker \
--volume var-lib-kubelet,kind=host,source=/var/lib/kubelet,recursive=true \
--mount volume=var-lib-kubelet,target=/var/lib/kubelet \
--volume var-log,kind=host,source=/var/log \
--mount volume=var-log,target=/var/log \
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=opt-cni-bin,target=/opt/cni/bin \
$${KUBELET_IMAGE} -- \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
- name: bootstrap.service
contents: |
[Unit]
Description=Kubernetes control plane
ConditionPathExists=!/opt/bootstrap/bootstrap.done
[Service]
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/opt/bootstrap
ExecStart=/usr/bin/rkt run \
--trust-keys-from-https \
--volume config,kind=host,source=/etc/kubernetes/bootstrap-secrets \
--mount volume=config,target=/etc/kubernetes/secrets \
--volume assets,kind=host,source=/opt/bootstrap/assets \
--mount volume=assets,target=/assets \
--volume script,kind=host,source=/opt/bootstrap/apply \
--mount volume=script,target=/apply \
--insecure-options=image \
docker://quay.io/poseidon/kubelet:v1.19.2 \
--net=host \
--dns=host \
--exec=/apply
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
[Install]
WantedBy=multi-user.target
storage:
directories:
- path: /var/lib/etcd
filesystem: root
mode: 0700
overwrite: true
files:
- path: /etc/kubernetes/kubeconfig
filesystem: root
mode: 0644
contents:
inline: |
${kubeconfig}
- path: /opt/bootstrap/layout
filesystem: root
mode: 0544
contents:
inline: |
#!/bin/bash -e
mkdir -p -- auth tls/etcd tls/k8s static-manifests manifests/coredns manifests-networking
awk '/#####/ {filename=$2; next} {print > filename}' assets
mkdir -p /etc/ssl/etcd/etcd
mkdir -p /etc/kubernetes/bootstrap-secrets
mv tls/etcd/{peer*,server*} /etc/ssl/etcd/etcd/
mv tls/etcd/etcd-client* /etc/kubernetes/bootstrap-secrets/
chown -R etcd:etcd /etc/ssl/etcd
chmod -R 500 /etc/ssl/etcd
chmod -R 700 /var/lib/etcd
mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
mkdir -p /etc/kubernetes/manifests
mv static-manifests/* /etc/kubernetes/manifests/
mkdir -p /opt/bootstrap/assets
mv manifests /opt/bootstrap/assets/manifests
mv manifests-networking/* /opt/bootstrap/assets/manifests/
rm -rf assets auth static-manifests tls manifests-networking
- path: /opt/bootstrap/apply
filesystem: root
mode: 0544
contents:
inline: |
#!/bin/bash -e
export KUBECONFIG=/etc/kubernetes/secrets/kubeconfig
until kubectl version; do
echo "Waiting for static pod control plane"
sleep 5
done
until kubectl apply -f /assets/manifests -R; do
echo "Retry applying manifests"
sleep 5
done
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
mode: 0644
contents:
inline: |
fs.inotify.max_user_watches=16184
passwd:
users:
- name: core
ssh_authorized_keys:
- "${ssh_authorized_key}"

View File

@ -1,50 +0,0 @@
locals {
# Pick a CoreOS Container Linux derivative
# coreos-stable -> Container Linux AMI
# flatcar-stable -> Flatcar Linux AMI
ami_id = local.flavor == "flatcar" ? data.aws_ami.flatcar.image_id : data.aws_ami.coreos.image_id
flavor = split("-", var.os_image)[0]
channel = split("-", var.os_image)[1]
}
data "aws_ami" "coreos" {
most_recent = true
owners = ["595879546273"]
filter {
name = "architecture"
values = ["x86_64"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
filter {
name = "name"
values = ["CoreOS-${local.flavor == "coreos" ? local.channel : "stable"}-*"]
}
}
data "aws_ami" "flatcar" {
most_recent = true
owners = ["075585003325"]
filter {
name = "architecture"
values = ["x86_64"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
filter {
name = "name"
values = ["Flatcar-${local.flavor == "flatcar" ? local.channel : "stable"}-*"]
}
}

View File

@ -1,140 +0,0 @@
---
systemd:
units:
- name: docker.service
enabled: true
- name: locksmithd.service
mask: true
- name: wait-for-dns.service
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
Wants=systemd-resolved.service
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
- name: kubelet.service
enabled: true
contents: |
[Unit]
Description=Kubelet
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.19.2
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/bin/rkt run \
--uuid-file-save=/var/cache/kubelet-pod.uuid \
--stage1-from-dir=stage1-fly.aci \
--hosts-entry host \
--insecure-options=image \
--volume etc-kubernetes,kind=host,source=/etc/kubernetes,readOnly=true \
--mount volume=etc-kubernetes,target=/etc/kubernetes \
--volume etc-machine-id,kind=host,source=/etc/machine-id,readOnly=true \
--mount volume=etc-machine-id,target=/etc/machine-id \
--volume etc-os-release,kind=host,source=/usr/lib/os-release,readOnly=true \
--mount volume=etc-os-release,target=/etc/os-release \
--volume=etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true \
--mount volume=etc-resolv,target=/etc/resolv.conf \
--volume etc-ssl-certs,kind=host,source=/etc/ssl/certs,readOnly=true \
--mount volume=etc-ssl-certs,target=/etc/ssl/certs \
--volume lib-modules,kind=host,source=/lib/modules,readOnly=true \
--mount volume=lib-modules,target=/lib/modules \
--volume run,kind=host,source=/run \
--mount volume=run,target=/run \
--volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true \
--mount volume=usr-share-certs,target=/usr/share/ca-certificates \
--volume var-lib-calico,kind=host,source=/var/lib/calico,readOnly=true \
--mount volume=var-lib-calico,target=/var/lib/calico \
--volume var-lib-docker,kind=host,source=/var/lib/docker \
--mount volume=var-lib-docker,target=/var/lib/docker \
--volume var-lib-kubelet,kind=host,source=/var/lib/kubelet,recursive=true \
--mount volume=var-lib-kubelet,target=/var/lib/kubelet \
--volume var-log,kind=host,source=/var/log \
--mount volume=var-log,target=/var/log \
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=opt-cni-bin,target=/opt/cni/bin \
$${KUBELET_IMAGE} -- \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/node \
%{~ for label in split(",", node_labels) ~}
--node-labels=${label} \
%{~ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
- name: delete-node.service
enable: true
contents: |
[Unit]
Description=Waiting to delete Kubernetes node on shutdown
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/etc/kubernetes/delete-node
[Install]
WantedBy=multi-user.target
storage:
files:
- path: /etc/kubernetes/kubeconfig
filesystem: root
mode: 0644
contents:
inline: |
${kubeconfig}
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
mode: 0644
contents:
inline: |
fs.inotify.max_user_watches=16184
- path: /etc/kubernetes/delete-node
filesystem: root
mode: 0744
contents:
inline: |
#!/bin/bash
set -e
exec /usr/bin/rkt run \
--trust-keys-from-https \
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://quay.io/poseidon/kubelet:v1.19.2 \
--net=host \
--dns=host \
--exec=/usr/local/bin/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
passwd:
users:
- name: core
ssh_authorized_keys:
- "${ssh_authorized_key}"

View File

@ -11,10 +11,10 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
-* Kubernetes v1.19.2 (upstream)
+* Kubernetes v1.21.1 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
-* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/cl/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
+* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/fedora-coreos/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization
* Ready for Ingress, Prometheus, Grafana, CSI, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
## Docs

View File

@ -18,3 +18,29 @@ data "aws_ami" "fedora-coreos" {
values = ["Fedora CoreOS ${var.os_stream} *"] values = ["Fedora CoreOS ${var.os_stream} *"]
} }
} }
# Experimental Fedora CoreOS arm64 / aarch64 AMIs from Poseidon
# WARNING: These AMIs will be removed when Fedora CoreOS publishes arm64 AMIs
# and may be removed for any reason before then as well. Do not use.
data "aws_ami" "fedora-coreos-arm" {
count = var.arch == "arm64" ? 1 : 0
most_recent = true
owners = ["099663496933"]
filter {
name = "architecture"
values = ["arm64"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
filter {
name = "name"
values = ["fedora-coreos-*"]
}
}
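
The fedora-coreos-arm data source above is evaluated only when arch is "arm64" (count = var.arch == "arm64" ? 1 : 0), so amd64 clusters never touch the experimental Poseidon-owned AMIs. A minimal sketch of how a cluster might opt in, assuming Typhoon's aws/fedora-coreos/kubernetes module path and made-up cluster values; controller_type and worker_type just need to be arm64 (Graviton) instance types:

module "tempest" {
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.21.1"

  # illustrative cluster settings
  cluster_name       = "tempest"
  dns_zone           = "aws.example.com"
  dns_zone_id        = "Z3PAABBCFAKE0"        # hypothetical Route53 zone id
  ssh_authorized_key = "ssh-ed25519 AAAA..."

  # experimental arm64 cluster (heed the removal warning above)
  arch            = "arm64"
  controller_type = "t4g.small"
  worker_type     = "t4g.small"
}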

View File

@ -1,11 +1,10 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
-source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d0f2123c5971410dc14aecde2307eb13e89c2bdf"
+source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=ebe3d5526a59b34c8f119a206358b0c0a6f6f67d"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
etcd_servers = aws_route53_record.etcds.*.fqdn
-asset_dir = var.asset_dir
networking = var.networking
network_mtu = var.network_mtu
pod_cidr = var.pod_cidr
@ -13,6 +12,7 @@ module "bootstrap" {
cluster_domain_suffix = var.cluster_domain_suffix
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
+daemonset_tolerations = var.daemonset_tolerations
trusted_certs_dir = "/etc/pki/tls/certs"
}

View File

@ -22,9 +22,8 @@ resource "aws_instance" "controllers" {
}
instance_type = var.controller_type
-ami = data.aws_ami.fedora-coreos.image_id
+ami = var.arch == "arm64" ? data.aws_ami.fedora-coreos-arm[0].image_id : data.aws_ami.fedora-coreos.image_id
user_data = data.ct_config.controller-ignitions.*.rendered[count.index]
# storage
root_block_device {
@ -63,6 +62,7 @@ data "template_file" "controller-configs" {
vars = {
# Cannot use cyclic dependencies on controllers or their DNS records
+etcd_arch = var.arch == "arm64" ? "-arm64" : ""
etcd_name = "etcd${count.index}"
etcd_domain = "${var.cluster_name}-etcd${count.index}.${var.dns_zone}"
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...

View File

@ -1,6 +1,6 @@
---
variant: fcos
-version: 1.1.0
+version: 1.2.0
systemd:
units:
- name: etcd-member.service
@ -8,28 +8,25 @@ systemd:
contents: |
[Unit]
Description=etcd (System Container)
-Documentation=https://github.com/coreos/etcd
+Documentation=https://github.com/etcd-io/etcd
Wants=network-online.target network.target
After=network-online.target
[Service]
-# https://github.com/opencontainers/runc/pull/1807
+Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.16${etcd_arch}
-# Type=notify
-# NotifyAccess=exec
Type=exec
-Restart=on-failure
-RestartSec=10s
-TimeoutStartSec=0
-LimitNOFILE=40000
ExecStartPre=/bin/mkdir -p /var/lib/etcd
ExecStartPre=-/usr/bin/podman rm etcd
-#--volume $${NOTIFY_SOCKET}:/run/systemd/notify \
ExecStart=/usr/bin/podman run --name etcd \
--env-file /etc/etcd/etcd.env \
--network host \
--volume /var/lib/etcd:/var/lib/etcd:rw,Z \
--volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \
-quay.io/coreos/etcd:v3.4.12
+$${ETCD_IMAGE}
ExecStop=/usr/bin/podman stop etcd
+Restart=on-failure
+RestartSec=10s
+TimeoutStartSec=0
+LimitNOFILE=40000
[Install]
WantedBy=multi-user.target
- name: docker.service
@ -53,10 +50,13 @@ systemd:
contents: |
[Unit]
Description=Kubelet (System Container)
+Requires=afterburn.service
+After=afterburn.service
Wants=rpc-statd.service
[Service]
-Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.19.2
+Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.21.1
-ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
+EnvironmentFile=/run/metadata/afterburn
+ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
@ -67,14 +67,12 @@ systemd:
--privileged \
--pid host \
--network host \
+--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
--volume /etc/kubernetes:/etc/kubernetes:ro,z \
--volume /usr/lib/os-release:/etc/os-release:ro \
---volume /etc/ssl/certs:/etc/ssl/certs:ro \
--volume /lib/modules:/lib/modules:ro \
--volume /run:/run \
---volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
+--volume /sys/fs/cgroup:/sys/fs/cgroup \
---volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
---volume /etc/pki/tls/certs:/usr/share/ca-certificates:ro \
--volume /var/lib/calico:/var/lib/calico:ro \
--volume /var/lib/docker:/var/lib/docker \
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
@ -92,12 +90,12 @@ systemd:
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
---cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
+--provider-id=aws:///$${AFTERBURN_AWS_AVAILABILITY_ZONE}/$${AFTERBURN_AWS_INSTANCE_ID} \
--read-only-port=0 \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
@ -120,11 +118,11 @@ systemd:
ExecStartPre=-/usr/bin/podman rm bootstrap
ExecStart=/usr/bin/podman run --name bootstrap \
--network host \
---volume /etc/kubernetes/bootstrap-secrets:/etc/kubernetes/secrets:ro,z \
+--volume /etc/kubernetes/pki:/etc/kubernetes/pki:ro,z \
--volume /opt/bootstrap/assets:/assets:ro,Z \
--volume /opt/bootstrap/apply:/apply:ro,Z \
--entrypoint=/apply \
-quay.io/poseidon/kubelet:v1.19.2
+quay.io/poseidon/kubelet:v1.21.1
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
ExecStartPost=-/usr/bin/podman stop bootstrap
storage:
@ -147,26 +145,26 @@ storage:
mkdir -p -- auth tls/etcd tls/k8s static-manifests manifests/coredns manifests-networking
awk '/#####/ {filename=$2; next} {print > filename}' assets
mkdir -p /etc/ssl/etcd/etcd
-mkdir -p /etc/kubernetes/bootstrap-secrets
+mkdir -p /etc/kubernetes/pki
mv tls/etcd/{peer*,server*} /etc/ssl/etcd/etcd/
-mv tls/etcd/etcd-client* /etc/kubernetes/bootstrap-secrets/
+mv tls/etcd/etcd-client* /etc/kubernetes/pki/
chown -R etcd:etcd /etc/ssl/etcd
chmod -R 500 /etc/ssl/etcd
-mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
+mv auth/* /etc/kubernetes/pki/
-mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
+mv tls/k8s/* /etc/kubernetes/pki/
mkdir -p /etc/kubernetes/manifests
mv static-manifests/* /etc/kubernetes/manifests/
mkdir -p /opt/bootstrap/assets
mv manifests /opt/bootstrap/assets/manifests
mv manifests-networking/* /opt/bootstrap/assets/manifests/
rm -rf assets auth static-manifests tls manifests-networking
-chcon -R -u system_u -t container_file_t /etc/kubernetes/bootstrap-secrets
+chcon -R -u system_u -t container_file_t /etc/kubernetes/pki
- path: /opt/bootstrap/apply
mode: 0544
contents:
inline: |
#!/bin/bash -e
-export KUBECONFIG=/etc/kubernetes/secrets/kubeconfig
+export KUBECONFIG=/etc/kubernetes/pki/admin.conf
until kubectl version; do
echo "Waiting for static pod control plane"
sleep 5
@ -202,8 +200,6 @@ storage:
mode: 0644
contents:
inline: |
-# TODO: Use a systemd dropin once podman v1.4.5 is avail.
-NOTIFY_SOCKET=/run/systemd/notify
ETCD_NAME=${etcd_name}
ETCD_DATA_DIR=/var/lib/etcd
ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379
@ -221,6 +217,7 @@ storage:
ETCD_PEER_CERT_FILE=/etc/ssl/certs/etcd/peer.crt
ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer.key
ETCD_PEER_CLIENT_CERT_AUTH=true
+ETCD_UNSUPPORTED_ARCH=arm64
passwd:
users:
- name: core

View File

@ -17,6 +17,7 @@ resource "aws_route53_record" "apiserver" {
resource "aws_lb" "nlb" { resource "aws_lb" "nlb" {
name = "${var.cluster_name}-nlb" name = "${var.cluster_name}-nlb"
load_balancer_type = "network" load_balancer_type = "network"
ip_address_type = "dualstack"
internal = false internal = false
subnets = aws_subnet.public.*.id subnets = aws_subnet.public.*.id
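
Setting ip_address_type = "dualstack" makes AWS assign IPv6 addresses to the NLB alongside IPv4, so the cluster endpoint can also publish an AAAA alias next to the existing apiserver record named in the hunk header above. A hedged sketch of what such a record could look like, assuming a data.aws_route53_zone.zone lookup and an example hostname (both hypothetical):

resource "aws_route53_record" "apiserver-aaaa" {
  zone_id = data.aws_route53_zone.zone.zone_id   # hypothetical hosted zone lookup
  name    = "tempest.aws.example.com"            # illustrative cluster DNS name
  type    = "AAAA"

  alias {
    name                   = aws_lb.nlb.dns_name
    zone_id                = aws_lb.nlb.zone_id
    evaluate_target_health = true
  }
}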

View File

@ -1,5 +1,6 @@
output "kubeconfig-admin" { output "kubeconfig-admin" {
value = module.bootstrap.kubeconfig-admin value = module.bootstrap.kubeconfig-admin
sensitive = true
} }
# Outputs for Kubernetes Ingress # Outputs for Kubernetes Ingress
@ -32,7 +33,8 @@ output "worker_security_groups" {
} }
output "kubeconfig" { output "kubeconfig" {
value = module.bootstrap.kubeconfig-kubelet value = module.bootstrap.kubeconfig-kubelet
sensitive = true
} }
# Outputs for custom load balancing # Outputs for custom load balancing
@ -52,3 +54,10 @@ output "worker_target_group_https" {
value = module.workers.target_group_https value = module.workers.target_group_https
} }
# Outputs for debug
output "assets_dist" {
value = module.bootstrap.assets_dist
sensitive = true
}
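
With sensitive = true, Terraform redacts these kubeconfig values in plan and apply output; they remain usable from other resources and via terraform output -raw kubeconfig-admin on Terraform v0.14+. A small sketch of the usual pattern for writing the admin kubeconfig to disk, assuming the cluster module is instantiated under the hypothetical name module.tempest:

resource "local_file" "kubeconfig-tempest" {
  content         = module.tempest.kubeconfig-admin
  filename        = "/home/user/.kube/configs/tempest-config"   # illustrative path
  file_permission = "0600"
}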

View File

@ -43,14 +43,19 @@ variable "worker_type" {
variable "os_stream" { variable "os_stream" {
type = string type = string
description = "Fedora CoreOs image stream for instances (e.g. stable, testing, next)" description = "Fedora CoreOS image stream for instances (e.g. stable, testing, next)"
default = "stable" default = "stable"
validation {
condition = contains(["stable", "testing", "next"], var.os_stream)
error_message = "The os_stream must be stable, testing, or next."
}
} }
variable "disk_size" { variable "disk_size" {
type = number type = number
description = "Size of the EBS volume in GB" description = "Size of the EBS volume in GB"
default = 40 default = 30
} }
variable "disk_type" { variable "disk_type" {
@ -96,12 +101,6 @@ variable "ssh_authorized_key" {
description = "SSH public key for user 'core'" description = "SSH public key for user 'core'"
} }
variable "asset_dir" {
type = string
description = "Absolute path to a directory where generated assets should be placed (contains secrets)"
default = ""
}
variable "networking" { variable "networking" {
type = string type = string
description = "Choice of networking provider (calico or flannel)" description = "Choice of networking provider (calico or flannel)"
@ -161,3 +160,19 @@ variable "cluster_domain_suffix" {
default = "cluster.local" default = "cluster.local"
} }
variable "arch" {
type = string
description = "Container architecture (amd64 or arm64)"
default = "amd64"
validation {
condition = var.arch == "amd64" || var.arch == "arm64"
error_message = "The arch must be amd64 or arm64."
}
}
variable "daemonset_tolerations" {
type = list(string)
description = "List of additional taint keys kube-system DaemonSets should tolerate (e.g. ['custom-role', 'gpu-role'])"
default = []
}
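
The new validation blocks reject bad values at plan time instead of failing later during the AMI lookup, and daemonset_tolerations lets kube-system DaemonSets such as the CNI pods schedule onto worker pools that carry extra taints. A sketch with illustrative values only, reusing the same hypothetical tempest cluster module from the earlier sketch:

module "tempest" {
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.21.1"
  # ... required cluster settings elided ...

  os_stream             = "testing"       # must be "stable", "testing", or "next"
  disk_size             = 30              # matches the new default; raise it if nodes need more local disk
  daemonset_tolerations = ["gpu-role"]    # tolerate the taint key used by a dedicated worker pool
}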

View File

@ -1,7 +1,7 @@
# Terraform version and plugin versions
terraform {
-required_version = ">= 0.12.26, < 0.14.0"
+required_version = ">= 0.13.0, < 0.16.0"
required_providers {
aws = ">= 2.23, <= 4.0"
template = "~> 2.1"
@ -9,7 +9,7 @@ terraform {
ct = {
source = "poseidon/ct"
-version = "~> 0.6.1"
+version = "~> 0.8"
}
}
}
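
Raising the constraints to Terraform v0.13–v0.15 and poseidon/ct ~> 0.8 means the calling configuration must declare provider sources itself (Terraform v0.13 introduced required_providers source addresses). A minimal sketch of a compatible providers block in the root module; the pinned versions are just examples that satisfy the ranges shown above:

terraform {
  required_version = ">= 0.13.0, < 0.16.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "3.38.0"     # example pin within ">= 2.23, <= 4.0"
    }
    ct = {
      source  = "poseidon/ct"
      version = "0.8.0"      # example pin within "~> 0.8"
    }
  }
}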

View File

@ -9,6 +9,7 @@ module "workers" {
worker_count = var.worker_count
instance_type = var.worker_type
os_stream = var.os_stream
+arch = var.arch
disk_size = var.disk_size
spot_price = var.worker_price
target_groups = var.worker_target_groups

View File

@ -18,3 +18,29 @@ data "aws_ami" "fedora-coreos" {
values = ["Fedora CoreOS ${var.os_stream} *"] values = ["Fedora CoreOS ${var.os_stream} *"]
} }
} }
# Experimental Fedora CoreOS arm64 / aarch64 AMIs from Poseidon
# WARNING: These AMIs will be removed when Fedora CoreOS publishes arm64 AMIs
# and may be removed for any reason before then as well. Do not use.
data "aws_ami" "fedora-coreos-arm" {
count = var.arch == "arm64" ? 1 : 0
most_recent = true
owners = ["099663496933"]
filter {
name = "architecture"
values = ["arm64"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
filter {
name = "name"
values = ["fedora-coreos-*"]
}
}

View File

@ -1,6 +1,6 @@
---
variant: fcos
-version: 1.1.0
+version: 1.2.0
systemd:
units:
- name: docker.service
@ -23,10 +23,13 @@ systemd:
contents: |
[Unit]
Description=Kubelet (System Container)
+Requires=afterburn.service
+After=afterburn.service
Wants=rpc-statd.service
[Service]
-Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.19.2
+Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.21.1
-ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
+EnvironmentFile=/run/metadata/afterburn
+ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
@ -37,14 +40,12 @@ systemd:
--privileged \
--pid host \
--network host \
+--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
--volume /etc/kubernetes:/etc/kubernetes:ro,z \
--volume /usr/lib/os-release:/etc/os-release:ro \
---volume /etc/ssl/certs:/etc/ssl/certs:ro \
--volume /lib/modules:/lib/modules:ro \
--volume /run:/run \
---volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
+--volume /sys/fs/cgroup:/sys/fs/cgroup \
---volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
---volume /etc/pki/tls/certs:/usr/share/ca-certificates:ro \
--volume /var/lib/calico:/var/lib/calico:ro \
--volume /var/lib/docker:/var/lib/docker \
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
@ -62,7 +63,6 @@ systemd:
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
---cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
@ -70,7 +70,11 @@ systemd:
%{~ for label in split(",", node_labels) ~}
--node-labels=${label} \
%{~ endfor ~}
+%{~ for taint in split(",", node_taints) ~}
+--register-with-taints=${taint} \
+%{~ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \
+--provider-id=aws:///$${AFTERBURN_AWS_AVAILABILITY_ZONE}/$${AFTERBURN_AWS_INSTANCE_ID} \
--read-only-port=0 \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
@ -86,10 +90,11 @@ systemd:
[Unit]
Description=Delete Kubernetes node on shutdown
[Service]
+Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.21.1
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
-ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.19.2 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME'
+ExecStop=/bin/bash -c '/usr/bin/podman run --volume /var/lib/kubelet:/var/lib/kubelet:ro,z --entrypoint /usr/local/bin/kubectl $${KUBELET_IMAGE} --kubeconfig=/var/lib/kubelet/kubeconfig delete node $HOSTNAME'
[Install]
WantedBy=multi-user.target
storage:

View File

@ -36,14 +36,19 @@ variable "instance_type" {
variable "os_stream" { variable "os_stream" {
type = string type = string
description = "Fedora CoreOs image stream for instances (e.g. stable, testing, next)" description = "Fedora CoreOS image stream for instances (e.g. stable, testing, next)"
default = "stable" default = "stable"
validation {
condition = contains(["stable", "testing", "next"], var.os_stream)
error_message = "The os_stream must be stable, testing, or next."
}
} }
variable "disk_size" { variable "disk_size" {
type = number type = number
description = "Size of the EBS volume in GB" description = "Size of the EBS volume in GB"
default = 40 default = 30
} }
variable "disk_type" { variable "disk_type" {
@ -108,3 +113,22 @@ variable "node_labels" {
description = "List of initial node labels" description = "List of initial node labels"
default = [] default = []
} }
variable "node_taints" {
type = list(string)
description = "List of initial node taints"
default = []
}
# unofficial, undocumented, unsupported
variable "arch" {
type = string
description = "Container architecture (amd64 or arm64)"
default = "amd64"
validation {
condition = var.arch == "amd64" || var.arch == "arm64"
error_message = "The arch must be amd64 or arm64."
}
}
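
node_labels and the new node_taints apply to every instance in a worker pool at registration, which pairs with daemonset_tolerations on the cluster module when a pool is reserved for special workloads. A hedged sketch of a dedicated pool, assuming Typhoon's aws/fedora-coreos/kubernetes/workers submodule path and reusing outputs from the hypothetical module.tempest cluster; any variable or output name not shown in this diff is an assumption:

module "tempest-gpu-pool" {
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes/workers?ref=v1.21.1"

  # assumed pool wiring (names illustrative)
  name               = "tempest-gpu"
  vpc_id             = module.tempest.vpc_id
  subnet_ids         = module.tempest.subnet_ids
  security_groups    = module.tempest.worker_security_groups
  kubeconfig         = module.tempest.kubeconfig
  ssh_authorized_key = "ssh-ed25519 AAAA..."

  worker_count  = 2
  instance_type = "g4dn.xlarge"
  node_labels   = ["gpu-role=true"]
  node_taints   = ["gpu-role=true:NoSchedule"]
}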

View File

@ -1,14 +1,14 @@
# Terraform version and plugin versions
terraform {
-required_version = ">= 0.12.26, < 0.14.0"
+required_version = ">= 0.13.0, < 0.16.0"
required_providers {
aws = ">= 2.23, <= 4.0"
template = "~> 2.1"
ct = {
source = "poseidon/ct"
-version = "~> 0.6.1"
+version = "~> 0.8"
}
}
}

View File

@ -44,7 +44,7 @@ resource "aws_autoscaling_group" "workers" {
# Worker template
resource "aws_launch_configuration" "worker" {
-image_id = data.aws_ami.fedora-coreos.image_id
+image_id = var.arch == "arm64" ? data.aws_ami.fedora-coreos-arm[0].image_id : data.aws_ami.fedora-coreos.image_id
instance_type = var.instance_type
spot_price = var.spot_price > 0 ? var.spot_price : null
enable_monitoring = false
@ -86,6 +86,7 @@ data "template_file" "worker-config" {
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
node_labels = join(",", var.node_labels)
+node_taints = join(",", var.node_taints)
}
}
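
The Butane template only receives flat strings, so the list variables are collapsed here with join(",", ...) and re-split by the template's %{ for } directives, turning each entry into one kubelet flag. A small sketch of that round trip with illustrative taints:

locals {
  # what this workers.tf passes into the template
  node_taints = join(",", ["gpu-role=true:NoSchedule", "ssd=true:PreferNoSchedule"])

  # roughly how the %{ for } directive in the worker template expands it
  taint_flags = [
    for taint in split(",", local.node_taints) : "--register-with-taints=${taint}"
  ]
}

output "taint_flags" {
  # ["--register-with-taints=gpu-role=true:NoSchedule", "--register-with-taints=ssd=true:PreferNoSchedule"]
  value = local.taint_flags
}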

View File

@ -11,13 +11,13 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
-* Kubernetes v1.19.2 (upstream)
+* Kubernetes v1.21.1 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
-* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/cl/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
+* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/flatcar-linux/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization
* Ready for Ingress, Prometheus, Grafana, CSI, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
## Docs
-Please see the [official docs](https://typhoon.psdn.io) and the AWS [tutorial](https://typhoon.psdn.io/cl/aws/).
+Please see the [official docs](https://typhoon.psdn.io) and the AWS [tutorial](https://typhoon.psdn.io/flatcar-linux/aws/).

View File

@ -0,0 +1,27 @@
locals {
# Pick a Flatcar Linux AMI
# flatcar-stable -> Flatcar Linux AMI
ami_id = data.aws_ami.flatcar.image_id
channel = split("-", var.os_image)[1]
}
data "aws_ami" "flatcar" {
most_recent = true
owners = ["075585003325"]
filter {
name = "architecture"
values = ["x86_64"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
filter {
name = "name"
values = ["Flatcar-${local.channel}-*"]
}
}
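
With the CoreOS Container Linux branch removed, only the channel suffix of os_image matters in the locals above: split("-", var.os_image)[1] turns "flatcar-beta" into "beta", which then feeds the AMI name filter. A tiny sketch of the derivation with an illustrative value:

locals {
  os_image = "flatcar-beta"                   # e.g. var.os_image
  channel  = split("-", local.os_image)[1]    # "beta"
  ami_name = "Flatcar-${local.channel}-*"     # name filter becomes "Flatcar-beta-*"
}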

View File

@ -1,11 +1,10 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
-source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d0f2123c5971410dc14aecde2307eb13e89c2bdf"
+source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=ebe3d5526a59b34c8f119a206358b0c0a6f6f67d"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
etcd_servers = aws_route53_record.etcds.*.fqdn
-asset_dir = var.asset_dir
networking = var.networking
network_mtu = var.network_mtu
pod_cidr = var.pod_cidr
@ -13,5 +12,6 @@ module "bootstrap" {
cluster_domain_suffix = var.cluster_domain_suffix
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
+daemonset_tolerations = var.daemonset_tolerations
}

View File

@ -0,0 +1,215 @@
---
systemd:
units:
- name: etcd-member.service
enabled: true
contents: |
[Unit]
Description=etcd (System Container)
Documentation=https://github.com/etcd-io/etcd
Requires=docker.service
After=docker.service
[Service]
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.16
ExecStartPre=/usr/bin/docker run -d \
--name etcd \
--network host \
--env-file /etc/etcd/etcd.env \
--user 232:232 \
--volume /etc/ssl/etcd:/etc/ssl/certs:ro \
--volume /var/lib/etcd:/var/lib/etcd:rw \
$${ETCD_IMAGE}
ExecStart=docker logs -f etcd
ExecStop=docker stop etcd
ExecStopPost=docker rm etcd
Restart=always
RestartSec=10s
TimeoutStartSec=0
LimitNOFILE=40000
[Install]
WantedBy=multi-user.target
- name: docker.service
enabled: true
- name: locksmithd.service
mask: true
- name: wait-for-dns.service
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
Wants=systemd-resolved.service
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
RequiredBy=etcd-member.service
- name: kubelet.service
enabled: true
contents: |
[Unit]
Description=Kubelet (System Container)
Requires=docker.service
After=docker.service
Requires=coreos-metadata.service
After=coreos-metadata.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.21.1
EnvironmentFile=/run/metadata/coreos
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=/usr/bin/docker run -d \
--name kubelet \
--privileged \
--pid host \
--network host \
-v /etc/cni/net.d:/etc/cni/net.d:ro \
-v /etc/kubernetes:/etc/kubernetes:ro \
-v /etc/machine-id:/etc/machine-id:ro \
-v /usr/lib/os-release:/etc/os-release:ro \
-v /lib/modules:/lib/modules:ro \
-v /run:/run \
-v /sys/fs/cgroup:/sys/fs/cgroup:ro \
-v /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
-v /var/lib/calico:/var/lib/calico:ro \
-v /var/lib/docker:/var/lib/docker \
-v /var/lib/kubelet:/var/lib/kubelet:rshared \
-v /var/log:/var/log \
-v /opt/cni/bin:/opt/cni/bin \
$${KUBELET_IMAGE} \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--provider-id=aws:///$${COREOS_EC2_AVAILABILITY_ZONE}/$${COREOS_EC2_INSTANCE_ID} \
--read-only-port=0 \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStart=docker logs -f kubelet
ExecStop=docker stop kubelet
ExecStopPost=docker rm kubelet
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
- name: bootstrap.service
contents: |
[Unit]
Description=Kubernetes control plane
Wants=docker.service
After=docker.service
ConditionPathExists=!/opt/bootstrap/bootstrap.done
[Service]
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/opt/bootstrap
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.21.1
ExecStart=/usr/bin/docker run \
-v /etc/kubernetes/pki:/etc/kubernetes/pki:ro \
-v /opt/bootstrap/assets:/assets:ro \
-v /opt/bootstrap/apply:/apply:ro \
--entrypoint=/apply \
$${KUBELET_IMAGE}
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
[Install]
WantedBy=multi-user.target
storage:
directories:
- path: /var/lib/etcd
filesystem: root
mode: 0700
overwrite: true
files:
- path: /etc/kubernetes/kubeconfig
filesystem: root
mode: 0644
contents:
inline: |
${kubeconfig}
- path: /opt/bootstrap/layout
filesystem: root
mode: 0544
contents:
inline: |
#!/bin/bash -e
mkdir -p -- auth tls/etcd tls/k8s static-manifests manifests/coredns manifests-networking
awk '/#####/ {filename=$2; next} {print > filename}' assets
mkdir -p /etc/ssl/etcd/etcd
mkdir -p /etc/kubernetes/pki
mv tls/etcd/{peer*,server*} /etc/ssl/etcd/etcd/
mv tls/etcd/etcd-client* /etc/kubernetes/pki/
chown -R etcd:etcd /etc/ssl/etcd
chmod -R 500 /etc/ssl/etcd
chmod -R 700 /var/lib/etcd
mv auth/* /etc/kubernetes/pki/
mv tls/k8s/* /etc/kubernetes/pki/
mkdir -p /etc/kubernetes/manifests
mv static-manifests/* /etc/kubernetes/manifests/
mkdir -p /opt/bootstrap/assets
mv manifests /opt/bootstrap/assets/manifests
mv manifests-networking/* /opt/bootstrap/assets/manifests/
rm -rf assets auth static-manifests tls manifests-networking
- path: /opt/bootstrap/apply
filesystem: root
mode: 0544
contents:
inline: |
#!/bin/bash -e
export KUBECONFIG=/etc/kubernetes/pki/admin.conf
until kubectl version; do
echo "Waiting for static pod control plane"
sleep 5
done
until kubectl apply -f /assets/manifests -R; do
echo "Retry applying manifests"
sleep 5
done
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
mode: 0644
contents:
inline: |
fs.inotify.max_user_watches=16184
- path: /etc/etcd/etcd.env
filesystem: root
mode: 0644
contents:
inline: |
ETCD_NAME=${etcd_name}
ETCD_DATA_DIR=/var/lib/etcd
ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380
ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379
ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380
ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381
ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}
ETCD_STRICT_RECONFIG_CHECK=true
ETCD_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/server-ca.crt
ETCD_CERT_FILE=/etc/ssl/certs/etcd/server.crt
ETCD_KEY_FILE=/etc/ssl/certs/etcd/server.key
ETCD_CLIENT_CERT_AUTH=true
ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/peer-ca.crt
ETCD_PEER_CERT_FILE=/etc/ssl/certs/etcd/peer.crt
ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer.key
ETCD_PEER_CLIENT_CERT_AUTH=true
passwd:
users:
- name: core
ssh_authorized_keys:
- "${ssh_authorized_key}"

View File

@ -67,7 +67,6 @@ data "template_file" "controller-configs" {
etcd_domain = "${var.cluster_name}-etcd${count.index}.${var.dns_zone}"
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
etcd_initial_cluster = join(",", data.template_file.etcds.*.rendered)
-cgroup_driver = local.flavor == "flatcar" && local.channel == "edge" ? "systemd" : "cgroupfs"
kubeconfig = indent(10, module.bootstrap.kubeconfig-kubelet)
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)

View File

@ -17,6 +17,7 @@ resource "aws_route53_record" "apiserver" {
resource "aws_lb" "nlb" { resource "aws_lb" "nlb" {
name = "${var.cluster_name}-nlb" name = "${var.cluster_name}-nlb"
load_balancer_type = "network" load_balancer_type = "network"
ip_address_type = "dualstack"
internal = false internal = false
subnets = aws_subnet.public.*.id subnets = aws_subnet.public.*.id

View File

@ -1,5 +1,6 @@
output "kubeconfig-admin" { output "kubeconfig-admin" {
value = module.bootstrap.kubeconfig-admin value = module.bootstrap.kubeconfig-admin
sensitive = true
} }
# Outputs for Kubernetes Ingress # Outputs for Kubernetes Ingress
@ -32,7 +33,8 @@ output "worker_security_groups" {
} }
output "kubeconfig" { output "kubeconfig" {
value = module.bootstrap.kubeconfig-kubelet value = module.bootstrap.kubeconfig-kubelet
sensitive = true
} }
# Outputs for custom load balancing # Outputs for custom load balancing
@ -52,3 +54,10 @@ output "worker_target_group_https" {
value = module.workers.target_group_https value = module.workers.target_group_https
} }
# Outputs for debug
output "assets_dist" {
value = module.bootstrap.assets_dist
sensitive = true
}

View File

@ -43,14 +43,19 @@ variable "worker_type" {
variable "os_image" { variable "os_image" {
type = string type = string
description = "AMI channel for a Container Linux derivative (coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge)" description = "AMI channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha)"
default = "flatcar-stable" default = "flatcar-stable"
validation {
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha"], var.os_image)
error_message = "The os_image must be flatcar-stable, flatcar-beta, or flatcar-alpha."
}
} }
variable "disk_size" { variable "disk_size" {
type = number type = number
description = "Size of the EBS volume in GB" description = "Size of the EBS volume in GB"
default = 40 default = 30
} }
variable "disk_type" { variable "disk_type" {
@ -149,15 +154,14 @@ variable "worker_node_labels" {
# unofficial, undocumented, unsupported # unofficial, undocumented, unsupported
variable "asset_dir" {
type = string
description = "Absolute path to a directory where generated assets should be placed (contains secrets)"
default = ""
}
variable "cluster_domain_suffix" { variable "cluster_domain_suffix" {
type = string type = string
description = "Queries for domains with the suffix will be answered by CoreDNS. Default is cluster.local (e.g. foo.default.svc.cluster.local)" description = "Queries for domains with the suffix will be answered by CoreDNS. Default is cluster.local (e.g. foo.default.svc.cluster.local)"
default = "cluster.local" default = "cluster.local"
} }
variable "daemonset_tolerations" {
type = list(string)
description = "List of additional taint keys kube-system DaemonSets should tolerate (e.g. ['custom-role', 'gpu-role'])"
default = []
}
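
The tightened os_image description and validation make the Container Linux removal explicit: coreos-* channels and flatcar-edge now fail terraform plan with the error message above rather than surfacing later as an AMI lookup error. For illustration, a terraform.tfvars line that still passes:

# terraform.tfvars (illustrative)
os_image = "flatcar-beta"
# "flatcar-stable" and "flatcar-alpha" are the other accepted values;
# "coreos-stable" or "flatcar-edge" are rejected at plan time.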

View File

@ -1,7 +1,7 @@
# Terraform version and plugin versions
terraform {
-required_version = ">= 0.12.26, < 0.14.0"
+required_version = ">= 0.13.0, < 0.16.0"
required_providers {
aws = ">= 2.23, <= 4.0"
template = "~> 2.1"
@ -9,7 +9,7 @@ terraform {
ct = {
source = "poseidon/ct"
-version = "~> 0.6.1"
+version = "~> 0.8"
}
}
}

View File

@ -0,0 +1,27 @@
locals {
# Pick a Flatcar Linux AMI
# flatcar-stable -> Flatcar Linux AMI
ami_id = data.aws_ami.flatcar.image_id
channel = split("-", var.os_image)[1]
}
data "aws_ami" "flatcar" {
most_recent = true
owners = ["075585003325"]
filter {
name = "architecture"
values = ["x86_64"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
filter {
name = "name"
values = ["Flatcar-${local.channel}-*"]
}
}

View File

@ -0,0 +1,122 @@
---
systemd:
units:
- name: docker.service
enabled: true
- name: locksmithd.service
mask: true
- name: wait-for-dns.service
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
Wants=systemd-resolved.service
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
- name: kubelet.service
enabled: true
contents: |
[Unit]
Description=Kubelet
Requires=docker.service
After=docker.service
Requires=coreos-metadata.service
After=coreos-metadata.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.21.1
EnvironmentFile=/run/metadata/coreos
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
# Podman, rkt, or runc run container processes, whereas docker run
# is a client to a daemon and requires workarounds to use within a
# systemd unit. https://github.com/moby/moby/issues/6791
ExecStartPre=/usr/bin/docker run -d \
--name kubelet \
--privileged \
--pid host \
--network host \
-v /etc/cni/net.d:/etc/cni/net.d:ro \
-v /etc/kubernetes:/etc/kubernetes:ro \
-v /etc/machine-id:/etc/machine-id:ro \
-v /usr/lib/os-release:/etc/os-release:ro \
-v /lib/modules:/lib/modules:ro \
-v /run:/run \
-v /sys/fs/cgroup:/sys/fs/cgroup:ro \
-v /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
-v /var/lib/calico:/var/lib/calico:ro \
-v /var/lib/docker:/var/lib/docker \
-v /var/lib/kubelet:/var/lib/kubelet:rshared \
-v /var/log:/var/log \
-v /opt/cni/bin:/opt/cni/bin \
$${KUBELET_IMAGE} \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/node \
%{~ for label in split(",", node_labels) ~}
--node-labels=${label} \
%{~ endfor ~}
%{~ for taint in split(",", node_taints) ~}
--register-with-taints=${taint} \
%{~ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \
--provider-id=aws:///$${COREOS_EC2_AVAILABILITY_ZONE}/$${COREOS_EC2_INSTANCE_ID} \
--read-only-port=0 \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStart=docker logs -f kubelet
ExecStop=docker stop kubelet
ExecStopPost=docker rm kubelet
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
- name: delete-node.service
enabled: true
contents: |
[Unit]
Description=Delete Kubernetes node on shutdown
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.21.1
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/bin/bash -c '/usr/bin/docker run -v /var/lib/kubelet:/var/lib/kubelet:ro --entrypoint /usr/local/bin/kubectl $${KUBELET_IMAGE} --kubeconfig=/var/lib/kubelet/kubeconfig delete node $HOSTNAME'
[Install]
WantedBy=multi-user.target
storage:
files:
- path: /etc/kubernetes/kubeconfig
filesystem: root
mode: 0644
contents:
inline: |
${kubeconfig}
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
mode: 0644
contents:
inline: |
fs.inotify.max_user_watches=16184
passwd:
users:
- name: core
ssh_authorized_keys:
- "${ssh_authorized_key}"

View File

@ -36,14 +36,19 @@ variable "instance_type" {
variable "os_image" { variable "os_image" {
type = string type = string
description = "AMI channel for a Container Linux derivative (coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge)" description = "AMI channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha)"
default = "flatcar-stable" default = "flatcar-stable"
validation {
condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha"], var.os_image)
error_message = "The os_image must be flatcar-stable, flatcar-beta, or flatcar-alpha."
}
} }
variable "disk_size" { variable "disk_size" {
type = number type = number
description = "Size of the EBS volume in GB" description = "Size of the EBS volume in GB"
default = 40 default = 30
} }
variable "disk_type" { variable "disk_type" {
@ -108,3 +113,9 @@ variable "node_labels" {
description = "List of initial node labels" description = "List of initial node labels"
default = [] default = []
} }
variable "node_taints" {
type = list(string)
description = "List of initial node taints"
default = []
}

View File

@ -1,14 +1,14 @@
# Terraform version and plugin versions
terraform {
-required_version = ">= 0.12.26, < 0.14.0"
+required_version = ">= 0.13.0, < 0.16.0"
required_providers {
aws = ">= 2.23, <= 4.0"
template = "~> 2.1"
ct = {
source = "poseidon/ct"
-version = "~> 0.6.1"
+version = "~> 0.8"
}
}
}

View File

@ -85,8 +85,8 @@ data "template_file" "worker-config" {
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
-cgroup_driver = local.flavor == "flatcar" && local.channel == "edge" ? "systemd" : "cgroupfs"
node_labels = join(",", var.node_labels)
+node_taints = join(",", var.node_taints)
}
}

View File

@ -1,205 +0,0 @@
---
systemd:
units:
- name: etcd-member.service
enabled: true
dropins:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.4.12"
Environment="ETCD_IMAGE_URL=docker://quay.io/coreos/etcd"
Environment="RKT_RUN_ARGS=--insecure-options=image"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
Environment="ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379"
Environment="ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380"
Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
Environment="ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}"
Environment="ETCD_STRICT_RECONFIG_CHECK=true"
Environment="ETCD_SSL_DIR=/etc/ssl/etcd"
Environment="ETCD_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/server-ca.crt"
Environment="ETCD_CERT_FILE=/etc/ssl/certs/etcd/server.crt"
Environment="ETCD_KEY_FILE=/etc/ssl/certs/etcd/server.key"
Environment="ETCD_CLIENT_CERT_AUTH=true"
Environment="ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/peer-ca.crt"
Environment="ETCD_PEER_CERT_FILE=/etc/ssl/certs/etcd/peer.crt"
Environment="ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer.key"
Environment="ETCD_PEER_CLIENT_CERT_AUTH=true"
- name: docker.service
enabled: true
- name: locksmithd.service
mask: true
- name: wait-for-dns.service
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
Wants=systemd-resolved.service
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
RequiredBy=etcd-member.service
- name: kubelet.service
enabled: true
contents: |
[Unit]
Description=Kubelet
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.19.2
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/bin/rkt run \
--uuid-file-save=/var/cache/kubelet-pod.uuid \
--stage1-from-dir=stage1-fly.aci \
--hosts-entry host \
--insecure-options=image \
--volume etc-kubernetes,kind=host,source=/etc/kubernetes,readOnly=true \
--mount volume=etc-kubernetes,target=/etc/kubernetes \
--volume etc-machine-id,kind=host,source=/etc/machine-id,readOnly=true \
--mount volume=etc-machine-id,target=/etc/machine-id \
--volume etc-os-release,kind=host,source=/usr/lib/os-release,readOnly=true \
--mount volume=etc-os-release,target=/etc/os-release \
--volume=etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true \
--mount volume=etc-resolv,target=/etc/resolv.conf \
--volume etc-ssl-certs,kind=host,source=/etc/ssl/certs,readOnly=true \
--mount volume=etc-ssl-certs,target=/etc/ssl/certs \
--volume lib-modules,kind=host,source=/lib/modules,readOnly=true \
--mount volume=lib-modules,target=/lib/modules \
--volume run,kind=host,source=/run \
--mount volume=run,target=/run \
--volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true \
--mount volume=usr-share-certs,target=/usr/share/ca-certificates \
--volume var-lib-calico,kind=host,source=/var/lib/calico,readOnly=true \
--mount volume=var-lib-calico,target=/var/lib/calico \
--volume var-lib-docker,kind=host,source=/var/lib/docker \
--mount volume=var-lib-docker,target=/var/lib/docker \
--volume var-lib-kubelet,kind=host,source=/var/lib/kubelet,recursive=true \
--mount volume=var-lib-kubelet,target=/var/lib/kubelet \
--volume var-log,kind=host,source=/var/log \
--mount volume=var-log,target=/var/log \
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=opt-cni-bin,target=/opt/cni/bin \
$${KUBELET_IMAGE} -- \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
- name: bootstrap.service
contents: |
[Unit]
Description=Kubernetes control plane
ConditionPathExists=!/opt/bootstrap/bootstrap.done
[Service]
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/opt/bootstrap
ExecStart=/usr/bin/rkt run \
--trust-keys-from-https \
--volume config,kind=host,source=/etc/kubernetes/bootstrap-secrets \
--mount volume=config,target=/etc/kubernetes/secrets \
--volume assets,kind=host,source=/opt/bootstrap/assets \
--mount volume=assets,target=/assets \
--volume script,kind=host,source=/opt/bootstrap/apply \
--mount volume=script,target=/apply \
--insecure-options=image \
docker://quay.io/poseidon/kubelet:v1.19.2 \
--net=host \
--dns=host \
--exec=/apply
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
[Install]
WantedBy=multi-user.target
storage:
directories:
- path: /var/lib/etcd
filesystem: root
mode: 0700
overwrite: true
files:
- path: /etc/kubernetes/kubeconfig
filesystem: root
mode: 0644
contents:
inline: |
${kubeconfig}
- path: /opt/bootstrap/layout
filesystem: root
mode: 0544
contents:
inline: |
#!/bin/bash -e
mkdir -p -- auth tls/etcd tls/k8s static-manifests manifests/coredns manifests-networking
awk '/#####/ {filename=$2; next} {print > filename}' assets
mkdir -p /etc/ssl/etcd/etcd
mkdir -p /etc/kubernetes/bootstrap-secrets
mv tls/etcd/{peer*,server*} /etc/ssl/etcd/etcd/
mv tls/etcd/etcd-client* /etc/kubernetes/bootstrap-secrets/
chown -R etcd:etcd /etc/ssl/etcd
chmod -R 500 /etc/ssl/etcd
chmod -R 700 /var/lib/etcd
mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
mkdir -p /etc/kubernetes/manifests
mv static-manifests/* /etc/kubernetes/manifests/
mkdir -p /opt/bootstrap/assets
mv manifests /opt/bootstrap/assets/manifests
mv manifests-networking/* /opt/bootstrap/assets/manifests/
rm -rf assets auth static-manifests tls manifests-networking
- path: /opt/bootstrap/apply
filesystem: root
mode: 0544
contents:
inline: |
#!/bin/bash -e
export KUBECONFIG=/etc/kubernetes/secrets/kubeconfig
until kubectl version; do
echo "Waiting for static pod control plane"
sleep 5
done
until kubectl apply -f /assets/manifests -R; do
echo "Retry applying manifests"
sleep 5
done
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
mode: 0644
contents:
inline: |
fs.inotify.max_user_watches=16184
passwd:
users:
- name: core
ssh_authorized_keys:
- "${ssh_authorized_key}"


@@ -1,140 +0,0 @@
---
systemd:
units:
- name: docker.service
enabled: true
- name: locksmithd.service
mask: true
- name: wait-for-dns.service
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
Wants=systemd-resolved.service
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
- name: kubelet.service
enabled: true
contents: |
[Unit]
Description=Kubelet
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.19.2
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/bin/rkt run \
--uuid-file-save=/var/cache/kubelet-pod.uuid \
--stage1-from-dir=stage1-fly.aci \
--hosts-entry host \
--insecure-options=image \
--volume etc-kubernetes,kind=host,source=/etc/kubernetes,readOnly=true \
--mount volume=etc-kubernetes,target=/etc/kubernetes \
--volume etc-machine-id,kind=host,source=/etc/machine-id,readOnly=true \
--mount volume=etc-machine-id,target=/etc/machine-id \
--volume etc-os-release,kind=host,source=/usr/lib/os-release,readOnly=true \
--mount volume=etc-os-release,target=/etc/os-release \
--volume=etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true \
--mount volume=etc-resolv,target=/etc/resolv.conf \
--volume etc-ssl-certs,kind=host,source=/etc/ssl/certs,readOnly=true \
--mount volume=etc-ssl-certs,target=/etc/ssl/certs \
--volume lib-modules,kind=host,source=/lib/modules,readOnly=true \
--mount volume=lib-modules,target=/lib/modules \
--volume run,kind=host,source=/run \
--mount volume=run,target=/run \
--volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true \
--mount volume=usr-share-certs,target=/usr/share/ca-certificates \
--volume var-lib-calico,kind=host,source=/var/lib/calico,readOnly=true \
--mount volume=var-lib-calico,target=/var/lib/calico \
--volume var-lib-docker,kind=host,source=/var/lib/docker \
--mount volume=var-lib-docker,target=/var/lib/docker \
--volume var-lib-kubelet,kind=host,source=/var/lib/kubelet,recursive=true \
--mount volume=var-lib-kubelet,target=/var/lib/kubelet \
--volume var-log,kind=host,source=/var/log \
--mount volume=var-log,target=/var/log \
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=opt-cni-bin,target=/opt/cni/bin \
$${KUBELET_IMAGE} -- \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/node \
%{~ for label in split(",", node_labels) ~}
--node-labels=${label} \
%{~ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
- name: delete-node.service
enabled: true
contents: |
[Unit]
Description=Waiting to delete Kubernetes node on shutdown
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/etc/kubernetes/delete-node
[Install]
WantedBy=multi-user.target
storage:
files:
- path: /etc/kubernetes/kubeconfig
filesystem: root
mode: 0644
contents:
inline: |
${kubeconfig}
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
mode: 0644
contents:
inline: |
fs.inotify.max_user_watches=16184
- path: /etc/kubernetes/delete-node
filesystem: root
mode: 0744
contents:
inline: |
#!/bin/bash
set -e
exec /usr/bin/rkt run \
--trust-keys-from-https \
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://quay.io/poseidon/kubelet:v1.19.2 \
--net=host \
--dns=host \
--exec=/usr/local/bin/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname | tr '[:upper:]' '[:lower:]')
passwd:
users:
- name: core
ssh_authorized_keys:
- "${ssh_authorized_key}"


@@ -11,10 +11,10 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
-* Kubernetes v1.19.2 (upstream)
+* Kubernetes v1.21.1 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
-* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot priority](https://typhoon.psdn.io/fedora-coreos/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/) customization
+* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot priority](https://typhoon.psdn.io/fedora-coreos/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
## Docs


@@ -1,11 +1,10 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
-source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d0f2123c5971410dc14aecde2307eb13e89c2bdf"
+source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=ebe3d5526a59b34c8f119a206358b0c0a6f6f67d"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
etcd_servers = formatlist("%s.%s", azurerm_dns_a_record.etcds.*.name, var.dns_zone)
-asset_dir = var.asset_dir
networking = var.networking
@@ -19,6 +18,7 @@ module "bootstrap" {
cluster_domain_suffix = var.cluster_domain_suffix
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
+daemonset_tolerations = var.daemonset_tolerations
# Fedora CoreOS
trusted_certs_dir = "/etc/pki/tls/certs"


@@ -1,6 +1,6 @@
---
variant: fcos
-version: 1.1.0
+version: 1.2.0
systemd:
units:
- name: etcd-member.service
@@ -8,28 +8,25 @@ systemd:
contents: |
[Unit]
Description=etcd (System Container)
-Documentation=https://github.com/coreos/etcd
+Documentation=https://github.com/etcd-io/etcd
Wants=network-online.target network.target
After=network-online.target
[Service]
-# https://github.com/opencontainers/runc/pull/1807
-# Type=notify
-# NotifyAccess=exec
+Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.16
Type=exec
-Restart=on-failure
-RestartSec=10s
-TimeoutStartSec=0
-LimitNOFILE=40000
ExecStartPre=/bin/mkdir -p /var/lib/etcd
ExecStartPre=-/usr/bin/podman rm etcd
-#--volume $${NOTIFY_SOCKET}:/run/systemd/notify \
ExecStart=/usr/bin/podman run --name etcd \
--env-file /etc/etcd/etcd.env \
--network host \
--volume /var/lib/etcd:/var/lib/etcd:rw,Z \
--volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \
-quay.io/coreos/etcd:v3.4.12
+$${ETCD_IMAGE}
ExecStop=/usr/bin/podman stop etcd
+Restart=on-failure
+RestartSec=10s
+TimeoutStartSec=0
+LimitNOFILE=40000
[Install]
WantedBy=multi-user.target
- name: docker.service
@@ -54,8 +51,8 @@ systemd:
Description=Kubelet (System Container)
Wants=rpc-statd.service
[Service]
-Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.19.2
+Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.21.1
-ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
+ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
@@ -66,14 +63,12 @@ systemd:
--privileged \
--pid host \
--network host \
+--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
--volume /etc/kubernetes:/etc/kubernetes:ro,z \
--volume /usr/lib/os-release:/etc/os-release:ro \
---volume /etc/ssl/certs:/etc/ssl/certs:ro \
--volume /lib/modules:/lib/modules:ro \
--volume /run:/run \
---volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
+--volume /sys/fs/cgroup:/sys/fs/cgroup \
---volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
---volume /etc/pki/tls/certs:/usr/share/ca-certificates:ro \
--volume /var/lib/calico:/var/lib/calico:ro \
--volume /var/lib/docker:/var/lib/docker \
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
@@ -91,7 +86,6 @@ systemd:
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
---cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
@@ -119,11 +113,11 @@ systemd:
ExecStartPre=-/usr/bin/podman rm bootstrap
ExecStart=/usr/bin/podman run --name bootstrap \
--network host \
---volume /etc/kubernetes/bootstrap-secrets:/etc/kubernetes/secrets:ro,z \
+--volume /etc/kubernetes/pki:/etc/kubernetes/pki:ro,z \
--volume /opt/bootstrap/assets:/assets:ro,Z \
--volume /opt/bootstrap/apply:/apply:ro,Z \
--entrypoint=/apply \
-quay.io/poseidon/kubelet:v1.19.2
+quay.io/poseidon/kubelet:v1.21.1
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
ExecStartPost=-/usr/bin/podman stop bootstrap
storage:
@@ -146,26 +140,26 @@ storage:
mkdir -p -- auth tls/etcd tls/k8s static-manifests manifests/coredns manifests-networking
awk '/#####/ {filename=$2; next} {print > filename}' assets
mkdir -p /etc/ssl/etcd/etcd
-mkdir -p /etc/kubernetes/bootstrap-secrets
+mkdir -p /etc/kubernetes/pki
mv tls/etcd/{peer*,server*} /etc/ssl/etcd/etcd/
-mv tls/etcd/etcd-client* /etc/kubernetes/bootstrap-secrets/
+mv tls/etcd/etcd-client* /etc/kubernetes/pki/
chown -R etcd:etcd /etc/ssl/etcd
chmod -R 500 /etc/ssl/etcd
-mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
+mv auth/* /etc/kubernetes/pki/
-mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
+mv tls/k8s/* /etc/kubernetes/pki/
mkdir -p /etc/kubernetes/manifests
mv static-manifests/* /etc/kubernetes/manifests/
mkdir -p /opt/bootstrap/assets
mv manifests /opt/bootstrap/assets/manifests
mv manifests-networking/* /opt/bootstrap/assets/manifests/
rm -rf assets auth static-manifests tls manifests-networking
-chcon -R -u system_u -t container_file_t /etc/kubernetes/bootstrap-secrets
+chcon -R -u system_u -t container_file_t /etc/kubernetes/pki
- path: /opt/bootstrap/apply
mode: 0544
contents:
inline: |
#!/bin/bash -e
-export KUBECONFIG=/etc/kubernetes/secrets/kubeconfig
+export KUBECONFIG=/etc/kubernetes/pki/admin.conf
until kubectl version; do
echo "Waiting for static pod control plane"
sleep 5
@@ -201,8 +195,6 @@ storage:
mode: 0644
contents:
inline: |
-# TODO: Use a systemd dropin once podman v1.4.5 is avail.
-NOTIFY_SOCKET=/run/systemd/notify
ETCD_NAME=${etcd_name}
ETCD_DATA_DIR=/var/lib/etcd
ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379


@@ -112,16 +112,12 @@ resource "azurerm_lb_outbound_rule" "worker-outbound" {
# Address pool of controllers
resource "azurerm_lb_backend_address_pool" "controller" {
-resource_group_name = azurerm_resource_group.cluster.name
name = "controller"
loadbalancer_id = azurerm_lb.cluster.id
}
# Address pool of workers
resource "azurerm_lb_backend_address_pool" "worker" {
-resource_group_name = azurerm_resource_group.cluster.name
name = "worker"
loadbalancer_id = azurerm_lb.cluster.id
}


@@ -1,5 +1,6 @@
output "kubeconfig-admin" {
value = module.bootstrap.kubeconfig-admin
+sensitive = true
}
# Outputs for Kubernetes Ingress
@@ -32,7 +33,8 @@ output "security_group_id" {
}
output "kubeconfig" {
value = module.bootstrap.kubeconfig-kubelet
+sensitive = true
}
# Outputs for custom firewalling
@@ -57,3 +59,11 @@ output "backend_address_pool_id" {
description = "ID of the worker backend address pool"
value = azurerm_lb_backend_address_pool.worker.id
}
+# Outputs for debug
+output "assets_dist" {
+value = module.bootstrap.assets_dist
+sensitive = true
+}
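With kubeconfig-admin now marked sensitive, Terraform redacts it in plan and apply output but it can still be written to disk explicitly. A minimal sketch, assuming a cluster module named "ramius" (the name and path are illustrative) and the hashicorp/local provider:

    resource "local_file" "kubeconfig-ramius" {
      # write the admin kubeconfig output to a local path with restricted permissions
      content         = module.ramius.kubeconfig-admin
      filename        = "/home/user/.kube/configs/ramius-config"
      file_permission = "0600"
    }

On Terraform v0.14 or newer, `terraform output -raw kubeconfig-admin` is another way to read a sensitive output.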


@@ -54,7 +54,7 @@ variable "os_image" {
variable "disk_size" {
type = number
description = "Size of the disk in GB"
-default = 40
+default = 30
}
variable "worker_priority" {
@@ -129,15 +129,14 @@ variable "worker_node_labels" {
# unofficial, undocumented, unsupported
-variable "asset_dir" {
-type = string
-description = "Absolute path to a directory where generated assets should be placed (contains secrets)"
-default = ""
-}
variable "cluster_domain_suffix" {
type = string
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
default = "cluster.local"
}
+variable "daemonset_tolerations" {
+type = list(string)
+description = "List of additional taint keys kube-system DaemonSets should tolerate (e.g. ['custom-role', 'gpu-role'])"
+default = []
+}
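Per its description, daemonset_tolerations lists extra taint keys that kube-system DaemonSets (CNI, kube-proxy, and similar) should tolerate. A minimal sketch of a cluster definition using it (module source, ref, and names are illustrative; required arguments are elided):

    module "ramius" {
      # illustrative source/ref; use the cluster module path you actually consume
      source = "git::https://github.com/poseidon/typhoon//azure/fedora-coreos/kubernetes?ref=<release>"

      # ... required cluster arguments elided ...

      # allow kube-system DaemonSets to tolerate the taint key applied to special worker pools
      daemonset_tolerations = ["gpu-role"]
    }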


@@ -1,7 +1,7 @@
# Terraform version and plugin versions
terraform {
-required_version = ">= 0.12.26, < 0.14.0"
+required_version = ">= 0.13.0, < 0.16.0"
required_providers {
azurerm = "~> 2.8"
template = "~> 2.1"
@@ -9,7 +9,7 @@ terraform {
ct = {
source = "poseidon/ct"
-version = "~> 0.6.1"
+version = "~> 0.8"
}
}
}


@@ -1,6 +1,6 @@
---
variant: fcos
-version: 1.1.0
+version: 1.2.0
systemd:
units:
- name: docker.service
@@ -24,8 +24,8 @@ systemd:
Description=Kubelet (System Container)
Wants=rpc-statd.service
[Service]
-Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.19.2
+Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.21.1
-ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
+ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
@@ -36,14 +36,12 @@ systemd:
--privileged \
--pid host \
--network host \
+--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
--volume /etc/kubernetes:/etc/kubernetes:ro,z \
--volume /usr/lib/os-release:/etc/os-release:ro \
---volume /etc/ssl/certs:/etc/ssl/certs:ro \
--volume /lib/modules:/lib/modules:ro \
--volume /run:/run \
---volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
+--volume /sys/fs/cgroup:/sys/fs/cgroup \
---volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
---volume /etc/pki/tls/certs:/usr/share/ca-certificates:ro \
--volume /var/lib/calico:/var/lib/calico:ro \
--volume /var/lib/docker:/var/lib/docker \
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
@@ -61,7 +59,6 @@ systemd:
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
---cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
@@ -69,6 +66,9 @@ systemd:
%{~ for label in split(",", node_labels) ~}
--node-labels=${label} \
%{~ endfor ~}
+%{~ for taint in split(",", node_taints) ~}
+--register-with-taints=${taint} \
+%{~ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--rotate-certificates \
@@ -85,10 +85,11 @@ systemd:
[Unit]
Description=Delete Kubernetes node on shutdown
[Service]
+Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.21.1
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
-ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.19.2 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME'
+ExecStop=/bin/bash -c '/usr/bin/podman run --volume /var/lib/kubelet:/var/lib/kubelet:ro,z --entrypoint /usr/local/bin/kubectl $${KUBELET_IMAGE} --kubeconfig=/var/lib/kubelet/kubeconfig delete node $HOSTNAME'
[Install]
WantedBy=multi-user.target
storage:


@@ -88,6 +88,12 @@ variable "node_labels" {
default = []
}
+variable "node_taints" {
+type = list(string)
+description = "List of initial node taints"
+default = []
+}
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" {


@@ -1,14 +1,14 @@
# Terraform version and plugin versions
terraform {
-required_version = ">= 0.12.26, < 0.14.0"
+required_version = ">= 0.13.0, < 0.16.0"
required_providers {
azurerm = "~> 2.8"
template = "~> 2.1"
ct = {
source = "poseidon/ct"
-version = "~> 0.6.1"
+version = "~> 0.8"
}
}
}


@@ -87,6 +87,7 @@ data "template_file" "worker-config" {
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
node_labels = join(",", var.node_labels)
+node_taints = join(",", var.node_taints)
}
}


@@ -11,13 +11,13 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
-* Kubernetes v1.19.2 (upstream)
+* Kubernetes v1.21.1 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
-* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [low-priority](https://typhoon.psdn.io/cl/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
+* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [low-priority](https://typhoon.psdn.io/flatcar-linux/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
## Docs
-Please see the [official docs](https://typhoon.psdn.io) and the Azure [tutorial](https://typhoon.psdn.io/cl/azure/).
+Please see the [official docs](https://typhoon.psdn.io) and the Azure [tutorial](https://typhoon.psdn.io/flatcar-linux/azure/).


@@ -1,11 +1,10 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
-source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d0f2123c5971410dc14aecde2307eb13e89c2bdf"
+source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=ebe3d5526a59b34c8f119a206358b0c0a6f6f67d"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
etcd_servers = formatlist("%s.%s", azurerm_dns_a_record.etcds.*.name, var.dns_zone)
-asset_dir = var.asset_dir
networking = var.networking
@@ -19,5 +18,6 @@ module "bootstrap" {
cluster_domain_suffix = var.cluster_domain_suffix
enable_reporting = var.enable_reporting
enable_aggregation = var.enable_aggregation
+daemonset_tolerations = var.daemonset_tolerations
}


@@ -0,0 +1,211 @@
---
systemd:
units:
- name: etcd-member.service
enabled: true
contents: |
[Unit]
Description=etcd (System Container)
Documentation=https://github.com/etcd-io/etcd
Requires=docker.service
After=docker.service
[Service]
Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.16
ExecStartPre=/usr/bin/docker run -d \
--name etcd \
--network host \
--env-file /etc/etcd/etcd.env \
--user 232:232 \
--volume /etc/ssl/etcd:/etc/ssl/certs:ro \
--volume /var/lib/etcd:/var/lib/etcd:rw \
$${ETCD_IMAGE}
ExecStart=docker logs -f etcd
ExecStop=docker stop etcd
ExecStopPost=docker rm etcd
Restart=always
RestartSec=10s
TimeoutStartSec=0
LimitNOFILE=40000
[Install]
WantedBy=multi-user.target
- name: docker.service
enabled: true
- name: locksmithd.service
mask: true
- name: wait-for-dns.service
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
Wants=systemd-resolved.service
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
RequiredBy=etcd-member.service
- name: kubelet.service
enabled: true
contents: |
[Unit]
Description=Kubelet (System Container)
Requires=docker.service
After=docker.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.21.1
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=/usr/bin/docker run -d \
--name kubelet \
--privileged \
--pid host \
--network host \
-v /etc/cni/net.d:/etc/cni/net.d:ro \
-v /etc/kubernetes:/etc/kubernetes:ro \
-v /etc/machine-id:/etc/machine-id:ro \
-v /usr/lib/os-release:/etc/os-release:ro \
-v /lib/modules:/lib/modules:ro \
-v /run:/run \
-v /sys/fs/cgroup:/sys/fs/cgroup:ro \
-v /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
-v /var/lib/calico:/var/lib/calico:ro \
-v /var/lib/docker:/var/lib/docker \
-v /var/lib/kubelet:/var/lib/kubelet:rshared \
-v /var/log:/var/log \
-v /opt/cni/bin:/opt/cni/bin \
$${KUBELET_IMAGE} \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStart=docker logs -f kubelet
ExecStop=docker stop kubelet
ExecStopPost=docker rm kubelet
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
- name: bootstrap.service
contents: |
[Unit]
Description=Kubernetes control plane
Wants=docker.service
After=docker.service
ConditionPathExists=!/opt/bootstrap/bootstrap.done
[Service]
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/opt/bootstrap
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.21.1
ExecStart=/usr/bin/docker run \
-v /etc/kubernetes/pki:/etc/kubernetes/pki:ro \
-v /opt/bootstrap/assets:/assets:ro \
-v /opt/bootstrap/apply:/apply:ro \
--entrypoint=/apply \
$${KUBELET_IMAGE}
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
[Install]
WantedBy=multi-user.target
storage:
directories:
- path: /var/lib/etcd
filesystem: root
mode: 0700
overwrite: true
files:
- path: /etc/kubernetes/kubeconfig
filesystem: root
mode: 0644
contents:
inline: |
${kubeconfig}
- path: /opt/bootstrap/layout
filesystem: root
mode: 0544
contents:
inline: |
#!/bin/bash -e
mkdir -p -- auth tls/etcd tls/k8s static-manifests manifests/coredns manifests-networking
awk '/#####/ {filename=$2; next} {print > filename}' assets
mkdir -p /etc/ssl/etcd/etcd
mkdir -p /etc/kubernetes/pki
mv tls/etcd/{peer*,server*} /etc/ssl/etcd/etcd/
mv tls/etcd/etcd-client* /etc/kubernetes/pki/
chown -R etcd:etcd /etc/ssl/etcd
chmod -R 500 /etc/ssl/etcd
chmod -R 700 /var/lib/etcd
mv auth/* /etc/kubernetes/pki/
mv tls/k8s/* /etc/kubernetes/pki/
mkdir -p /etc/kubernetes/manifests
mv static-manifests/* /etc/kubernetes/manifests/
mkdir -p /opt/bootstrap/assets
mv manifests /opt/bootstrap/assets/manifests
mv manifests-networking/* /opt/bootstrap/assets/manifests/
rm -rf assets auth static-manifests tls manifests-networking
- path: /opt/bootstrap/apply
filesystem: root
mode: 0544
contents:
inline: |
#!/bin/bash -e
export KUBECONFIG=/etc/kubernetes/pki/admin.conf
until kubectl version; do
echo "Waiting for static pod control plane"
sleep 5
done
until kubectl apply -f /assets/manifests -R; do
echo "Retry applying manifests"
sleep 5
done
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
mode: 0644
contents:
inline: |
fs.inotify.max_user_watches=16184
- path: /etc/etcd/etcd.env
filesystem: root
mode: 0644
contents:
inline: |
ETCD_NAME=${etcd_name}
ETCD_DATA_DIR=/var/lib/etcd
ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380
ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379
ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380
ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381
ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}
ETCD_STRICT_RECONFIG_CHECK=true
ETCD_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/server-ca.crt
ETCD_CERT_FILE=/etc/ssl/certs/etcd/server.crt
ETCD_KEY_FILE=/etc/ssl/certs/etcd/server.key
ETCD_CLIENT_CERT_AUTH=true
ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/peer-ca.crt
ETCD_PEER_CERT_FILE=/etc/ssl/certs/etcd/peer.crt
ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer.key
ETCD_PEER_CLIENT_CERT_AUTH=true
passwd:
users:
- name: core
ssh_authorized_keys:
- "${ssh_authorized_key}"


@@ -16,9 +16,7 @@ resource "azurerm_dns_a_record" "etcds" {
locals {
# Container Linux derivative
-# coreos-stable -> Container Linux Stable
# flatcar-stable -> Flatcar Linux Stable
-flavor = split("-", var.os_image)[0]
channel = split("-", var.os_image)[1]
}
@@ -53,23 +51,18 @@ resource "azurerm_linux_virtual_machine" "controllers" {
storage_account_type = "Premium_LRS"
}
-# CoreOS Container Linux or Flatcar Container Linux
+# Flatcar Container Linux
source_image_reference {
-publisher = local.flavor == "flatcar" ? "Kinvolk" : "CoreOS"
+publisher = "Kinvolk"
-offer = local.flavor == "flatcar" ? "flatcar-container-linux-free" : "CoreOS"
+offer = "flatcar-container-linux-free"
sku = local.channel
version = "latest"
}
-# Gross hack for Flatcar Linux
-dynamic "plan" {
-for_each = local.flavor == "flatcar" ? [1] : []
-content {
-name = local.channel
-publisher = "kinvolk"
-product = "flatcar-container-linux-free"
-}
+plan {
+name = local.channel
+publisher = "kinvolk"
+product = "flatcar-container-linux-free"
}
# network
@@ -157,7 +150,6 @@ data "template_file" "controller-configs" {
etcd_domain = "${var.cluster_name}-etcd${count.index}.${var.dns_zone}"
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
etcd_initial_cluster = join(",", data.template_file.etcds.*.rendered)
-cgroup_driver = local.flavor == "flatcar" && local.channel == "edge" ? "systemd" : "cgroupfs"
kubeconfig = indent(10, module.bootstrap.kubeconfig-kubelet)
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)


@@ -112,16 +112,12 @@ resource "azurerm_lb_outbound_rule" "worker-outbound" {
# Address pool of controllers
resource "azurerm_lb_backend_address_pool" "controller" {
-resource_group_name = azurerm_resource_group.cluster.name
name = "controller"
loadbalancer_id = azurerm_lb.cluster.id
}
# Address pool of workers
resource "azurerm_lb_backend_address_pool" "worker" {
-resource_group_name = azurerm_resource_group.cluster.name
name = "worker"
loadbalancer_id = azurerm_lb.cluster.id
}


@@ -1,5 +1,6 @@
output "kubeconfig-admin" {
value = module.bootstrap.kubeconfig-admin
+sensitive = true
}
# Outputs for Kubernetes Ingress
@@ -32,7 +33,8 @@ output "security_group_id" {
}
output "kubeconfig" {
value = module.bootstrap.kubeconfig-kubelet
+sensitive = true
}
# Outputs for custom firewalling
@@ -57,3 +59,11 @@ output "backend_address_pool_id" {
description = "ID of the worker backend address pool"
value = azurerm_lb_backend_address_pool.worker.id
}
+# Outputs for debug
+output "assets_dist" {
+value = module.bootstrap.assets_dist
+sensitive = true
+}


@@ -48,14 +48,19 @@ variable "worker_type" {
variable "os_image" {
type = string
-description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge, coreos-stable, coreos-beta, coreos-alpha)"
+description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha)"
default = "flatcar-stable"
+validation {
+condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha"], var.os_image)
+error_message = "The os_image must be flatcar-stable, flatcar-beta, or flatcar-alpha."
+}
}
variable "disk_size" {
type = number
description = "Size of the disk in GB"
-default = 40
+default = 30
}
variable "worker_priority" {
@@ -130,15 +135,14 @@ variable "worker_node_labels" {
# unofficial, undocumented, unsupported
-variable "asset_dir" {
-type = string
-description = "Absolute path to a directory where generated assets should be placed (contains secrets)"
-default = ""
-}
variable "cluster_domain_suffix" {
type = string
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
default = "cluster.local"
}
+variable "daemonset_tolerations" {
+type = list(string)
+description = "List of additional taint keys kube-system DaemonSets should tolerate (e.g. ['custom-role', 'gpu-role'])"
+default = []
+}


@@ -1,7 +1,7 @@
# Terraform version and plugin versions
terraform {
-required_version = ">= 0.12.26, < 0.14.0"
+required_version = ">= 0.13.0, < 0.16.0"
required_providers {
azurerm = "~> 2.8"
template = "~> 2.1"
@@ -9,7 +9,7 @@ terraform {
ct = {
source = "poseidon/ct"
-version = "~> 0.6.1"
+version = "~> 0.8"
}
}
}
} }


@@ -0,0 +1,118 @@
---
systemd:
units:
- name: docker.service
enabled: true
- name: locksmithd.service
mask: true
- name: wait-for-dns.service
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
Wants=systemd-resolved.service
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
- name: kubelet.service
enabled: true
contents: |
[Unit]
Description=Kubelet
Requires=docker.service
After=docker.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.21.1
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
# Podman, rkt, or runc run container processes, whereas docker run
# is a client to a daemon and requires workarounds to use within a
# systemd unit. https://github.com/moby/moby/issues/6791
ExecStartPre=/usr/bin/docker run -d \
--name kubelet \
--privileged \
--pid host \
--network host \
-v /etc/cni/net.d:/etc/cni/net.d:ro \
-v /etc/kubernetes:/etc/kubernetes:ro \
-v /etc/machine-id:/etc/machine-id:ro \
-v /usr/lib/os-release:/etc/os-release:ro \
-v /lib/modules:/lib/modules:ro \
-v /run:/run \
-v /sys/fs/cgroup:/sys/fs/cgroup:ro \
-v /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
-v /var/lib/calico:/var/lib/calico:ro \
-v /var/lib/docker:/var/lib/docker \
-v /var/lib/kubelet:/var/lib/kubelet:rshared \
-v /var/log:/var/log \
-v /opt/cni/bin:/opt/cni/bin \
$${KUBELET_IMAGE} \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--healthz-port=0 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/node \
%{~ for label in split(",", node_labels) ~}
--node-labels=${label} \
%{~ endfor ~}
%{~ for taint in split(",", node_taints) ~}
--register-with-taints=${taint} \
%{~ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStart=docker logs -f kubelet
ExecStop=docker stop kubelet
ExecStopPost=docker rm kubelet
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
- name: delete-node.service
enabled: true
contents: |
[Unit]
Description=Delete Kubernetes node on shutdown
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.21.1
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/bin/bash -c '/usr/bin/docker run -v /var/lib/kubelet:/var/lib/kubelet:ro --entrypoint /usr/local/bin/kubectl $${KUBELET_IMAGE} --kubeconfig=/var/lib/kubelet/kubeconfig delete node $HOSTNAME'
[Install]
WantedBy=multi-user.target
storage:
files:
- path: /etc/kubernetes/kubeconfig
filesystem: root
mode: 0644
contents:
inline: |
${kubeconfig}
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
mode: 0644
contents:
inline: |
fs.inotify.max_user_watches=16184
passwd:
users:
- name: core
ssh_authorized_keys:
- "${ssh_authorized_key}"


@@ -46,8 +46,13 @@ variable "vm_type" {
variable "os_image" {
type = string
-description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge, coreos-stable, coreos-beta, coreos-alpha)"
+description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha)"
default = "flatcar-stable"
+validation {
+condition = contains(["flatcar-stable", "flatcar-beta", "flatcar-alpha"], var.os_image)
+error_message = "The os_image must be flatcar-stable, flatcar-beta, or flatcar-alpha."
+}
}
variable "priority" {
@@ -89,6 +94,12 @@ variable "node_labels" {
default = []
}
+variable "node_taints" {
+type = list(string)
+description = "List of initial node taints"
+default = []
+}
# unofficial, undocumented, unsupported
variable "cluster_domain_suffix" {


@@ -1,14 +1,14 @@
# Terraform version and plugin versions
terraform {
-required_version = ">= 0.12.26, < 0.14.0"
+required_version = ">= 0.13.0, < 0.16.0"
required_providers {
azurerm = "~> 2.8"
template = "~> 2.1"
ct = {
source = "poseidon/ct"
-version = "~> 0.6.1"
+version = "~> 0.8"
}
}
}


@@ -1,7 +1,5 @@
locals {
-# coreos-stable -> Container Linux Stable
# flatcar-stable -> Flatcar Linux Stable
-flavor = split("-", var.os_image)[0]
channel = split("-", var.os_image)[1]
}
@@ -24,23 +22,18 @@ resource "azurerm_linux_virtual_machine_scale_set" "workers" {
caching = "ReadWrite"
}
-# CoreOS Container Linux or Flatcar Container Linux
+# Flatcar Container Linux
source_image_reference {
-publisher = local.flavor == "flatcar" ? "Kinvolk" : "CoreOS"
+publisher = "Kinvolk"
-offer = local.flavor == "flatcar" ? "flatcar-container-linux-free" : "CoreOS"
+offer = "flatcar-container-linux-free"
sku = local.channel
version = "latest"
}
-# Gross hack for Flatcar Linux
-dynamic "plan" {
-for_each = local.flavor == "flatcar" ? [1] : []
-content {
-name = local.channel
-publisher = "kinvolk"
-product = "flatcar-container-linux-free"
-}
+plan {
+name = local.channel
+publisher = "kinvolk"
+product = "flatcar-container-linux-free"
}
# Azure requires setting admin_ssh_key, though Ignition custom_data handles it too
@@ -111,8 +104,8 @@ data "template_file" "worker-config" {
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
-cgroup_driver = local.flavor == "flatcar" && local.channel == "edge" ? "systemd" : "cgroupfs"
node_labels = join(",", var.node_labels)
+node_taints = join(",", var.node_taints)
}
}


@@ -1,221 +0,0 @@
---
systemd:
units:
- name: etcd-member.service
enabled: true
dropins:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.4.12"
Environment="ETCD_IMAGE_URL=docker://quay.io/coreos/etcd"
Environment="RKT_RUN_ARGS=--insecure-options=image"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${domain_name}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${domain_name}:2380"
Environment="ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379"
Environment="ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380"
Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
Environment="ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}"
Environment="ETCD_STRICT_RECONFIG_CHECK=true"
Environment="ETCD_SSL_DIR=/etc/ssl/etcd"
Environment="ETCD_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/server-ca.crt"
Environment="ETCD_CERT_FILE=/etc/ssl/certs/etcd/server.crt"
Environment="ETCD_KEY_FILE=/etc/ssl/certs/etcd/server.key"
Environment="ETCD_CLIENT_CERT_AUTH=true"
Environment="ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/peer-ca.crt"
Environment="ETCD_PEER_CERT_FILE=/etc/ssl/certs/etcd/peer.crt"
Environment="ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer.key"
Environment="ETCD_PEER_CLIENT_CERT_AUTH=true"
- name: docker.service
enabled: true
- name: locksmithd.service
mask: true
- name: kubelet.path
enabled: true
contents: |
[Unit]
Description=Watch for kubeconfig
[Path]
PathExists=/etc/kubernetes/kubeconfig
[Install]
WantedBy=multi-user.target
- name: wait-for-dns.service
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
Wants=systemd-resolved.service
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
RequiredBy=etcd-member.service
- name: kubelet.service
contents: |
[Unit]
Description=Kubelet
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.19.2
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/bin/rkt run \
--uuid-file-save=/var/cache/kubelet-pod.uuid \
--stage1-from-dir=stage1-fly.aci \
--hosts-entry host \
--insecure-options=image \
--volume etc-kubernetes,kind=host,source=/etc/kubernetes,readOnly=true \
--mount volume=etc-kubernetes,target=/etc/kubernetes \
--volume etc-machine-id,kind=host,source=/etc/machine-id,readOnly=true \
--mount volume=etc-machine-id,target=/etc/machine-id \
--volume etc-os-release,kind=host,source=/usr/lib/os-release,readOnly=true \
--mount volume=etc-os-release,target=/etc/os-release \
--volume=etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true \
--mount volume=etc-resolv,target=/etc/resolv.conf \
--volume etc-ssl-certs,kind=host,source=/etc/ssl/certs,readOnly=true \
--mount volume=etc-ssl-certs,target=/etc/ssl/certs \
--volume lib-modules,kind=host,source=/lib/modules,readOnly=true \
--mount volume=lib-modules,target=/lib/modules \
--volume run,kind=host,source=/run \
--mount volume=run,target=/run \
--volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true \
--mount volume=usr-share-certs,target=/usr/share/ca-certificates \
--volume var-lib-calico,kind=host,source=/var/lib/calico,readOnly=true \
--mount volume=var-lib-calico,target=/var/lib/calico \
--volume var-lib-docker,kind=host,source=/var/lib/docker \
--mount volume=var-lib-docker,target=/var/lib/docker \
--volume var-lib-kubelet,kind=host,source=/var/lib/kubelet,recursive=true \
--mount volume=var-lib-kubelet,target=/var/lib/kubelet \
--volume var-log,kind=host,source=/var/log \
--mount volume=var-log,target=/var/log \
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=opt-cni-bin,target=/opt/cni/bin \
--volume etc-iscsi,kind=host,source=/etc/iscsi \
--mount volume=etc-iscsi,target=/etc/iscsi \
--volume usr-sbin-iscsiadm,kind=host,source=/usr/sbin/iscsiadm \
--mount volume=usr-sbin-iscsiadm,target=/sbin/iscsiadm \
$${KUBELET_IMAGE} -- \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--healthz-port=0 \
--hostname-override=${domain_name} \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
- name: bootstrap.service
contents: |
[Unit]
Description=Kubernetes control plane
ConditionPathExists=!/opt/bootstrap/bootstrap.done
[Service]
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/opt/bootstrap
ExecStart=/usr/bin/rkt run \
--trust-keys-from-https \
--volume config,kind=host,source=/etc/kubernetes/bootstrap-secrets \
--mount volume=config,target=/etc/kubernetes/secrets \
--volume assets,kind=host,source=/opt/bootstrap/assets \
--mount volume=assets,target=/assets \
--volume script,kind=host,source=/opt/bootstrap/apply \
--mount volume=script,target=/apply \
--insecure-options=image \
docker://quay.io/poseidon/kubelet:v1.19.2 \
--net=host \
--dns=host \
--exec=/apply
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
[Install]
WantedBy=multi-user.target
storage:
directories:
- path: /var/lib/etcd
filesystem: root
mode: 0700
overwrite: true
- path: /etc/kubernetes
filesystem: root
mode: 0755
files:
- path: /etc/hostname
filesystem: root
mode: 0644
contents:
inline:
${domain_name}
- path: /opt/bootstrap/layout
filesystem: root
mode: 0544
contents:
inline: |
#!/bin/bash -e
mkdir -p -- auth tls/etcd tls/k8s static-manifests manifests/coredns manifests-networking
awk '/#####/ {filename=$2; next} {print > filename}' assets
mkdir -p /etc/ssl/etcd/etcd
mkdir -p /etc/kubernetes/bootstrap-secrets
mv tls/etcd/{peer*,server*} /etc/ssl/etcd/etcd/
mv tls/etcd/etcd-client* /etc/kubernetes/bootstrap-secrets/
chown -R etcd:etcd /etc/ssl/etcd
chmod -R 500 /etc/ssl/etcd
chmod -R 700 /var/lib/etcd
mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
mkdir -p /etc/kubernetes/manifests
mv static-manifests/* /etc/kubernetes/manifests/
mkdir -p /opt/bootstrap/assets
mv manifests /opt/bootstrap/assets/manifests
mv manifests-networking/* /opt/bootstrap/assets/manifests/
rm -rf assets auth static-manifests tls manifests-networking
- path: /opt/bootstrap/apply
filesystem: root
mode: 0544
contents:
inline: |
#!/bin/bash -e
export KUBECONFIG=/etc/kubernetes/secrets/kubeconfig
until kubectl version; do
echo "Waiting for static pod control plane"
sleep 5
done
until kubectl apply -f /assets/manifests -R; do
echo "Retry applying manifests"
sleep 5
done
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
mode: 0644
contents:
inline: |
fs.inotify.max_user_watches=16184
passwd:
users:
- name: core
ssh_authorized_keys:
- ${ssh_authorized_key}
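For reference, the /opt/bootstrap/layout script shown above splits a single concatenated `assets` bundle into files using marker lines that begin with `#####` followed by the target path. A minimal sketch of that awk behaviour, using made-up file names and contents (the real bundle is produced by terraform-render-bootstrap):

#!/bin/bash -e
# Hypothetical two-entry bundle; real marker paths and contents differ.
mkdir -p demo/auth demo/tls && cd demo
cat > assets <<'EOF'
##### auth/kubeconfig
apiVersion: v1
kind: Config
##### tls/ca.crt
-----BEGIN CERTIFICATE-----
(placeholder)
-----END CERTIFICATE-----
EOF

# Marker lines select the output file; every other line is written to it.
awk '/#####/ {filename=$2; next} {print > filename}' assets

ls auth tls   # auth/kubeconfig  tls/ca.crt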


@@ -1,4 +0,0 @@
-output "kubeconfig-admin" {
-  value = module.bootstrap.kubeconfig-admin
-}


@@ -11,10 +11,10 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
 ## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
-* Kubernetes v1.19.2 (upstream)
+* Kubernetes v1.21.1 (upstream)
 * Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
 * On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
-* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
+* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#hosts) customization
 * Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
 ## Docs


@@ -1,11 +1,10 @@
 # Kubernetes assets (kubeconfig, manifests)
 module "bootstrap" {
-  source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=d0f2123c5971410dc14aecde2307eb13e89c2bdf"
+  source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=ebe3d5526a59b34c8f119a206358b0c0a6f6f67d"
   cluster_name = var.cluster_name
   api_servers = [var.k8s_domain_name]
   etcd_servers = var.controllers.*.domain
-  asset_dir = var.asset_dir
   networking = var.networking
   network_mtu = var.network_mtu
   network_ip_autodetection_method = var.network_ip_autodetection_method
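This hunk pins a newer terraform-render-bootstrap ref and drops the `asset_dir` argument, so generated assets are no longer written to a local directory by that module. After bumping a git-pinned module ref like this, modules need to be re-fetched before planning; a routine sketch (standard Terraform CLI commands, not part of the diff):

# Re-download pinned modules, then review the resulting plan
terraform init -upgrade
terraform plan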


@@ -1,6 +1,6 @@
 ---
 variant: fcos
-version: 1.1.0
+version: 1.2.0
 systemd:
 units:
 - name: etcd-member.service
@@ -8,28 +8,25 @@ systemd:
 contents: |
 [Unit]
 Description=etcd (System Container)
-Documentation=https://github.com/coreos/etcd
+Documentation=https://github.com/etcd-io/etcd
 Wants=network-online.target network.target
 After=network-online.target
 [Service]
-# https://github.com/opencontainers/runc/pull/1807
-# Type=notify
-# NotifyAccess=exec
+Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.16
 Type=exec
-Restart=on-failure
-RestartSec=10s
-TimeoutStartSec=0
-LimitNOFILE=40000
 ExecStartPre=/bin/mkdir -p /var/lib/etcd
 ExecStartPre=-/usr/bin/podman rm etcd
-#--volume $${NOTIFY_SOCKET}:/run/systemd/notify \
 ExecStart=/usr/bin/podman run --name etcd \
 --env-file /etc/etcd/etcd.env \
 --network host \
 --volume /var/lib/etcd:/var/lib/etcd:rw,Z \
 --volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \
-quay.io/coreos/etcd:v3.4.12
+$${ETCD_IMAGE}
 ExecStop=/usr/bin/podman stop etcd
+Restart=on-failure
+RestartSec=10s
+TimeoutStartSec=0
+LimitNOFILE=40000
 [Install]
 WantedBy=multi-user.target
 - name: docker.service
@@ -53,8 +50,8 @@ systemd:
 Description=Kubelet (System Container)
 Wants=rpc-statd.service
 [Service]
-Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.19.2
-ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
+Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.21.1
+ExecStartPre=/bin/mkdir -p /etc/cni/net.d
 ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
 ExecStartPre=/bin/mkdir -p /opt/cni/bin
 ExecStartPre=/bin/mkdir -p /var/lib/calico
@@ -65,22 +62,18 @@ systemd:
 --privileged \
 --pid host \
 --network host \
+--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
 --volume /etc/kubernetes:/etc/kubernetes:ro,z \
 --volume /usr/lib/os-release:/etc/os-release:ro \
---volume /etc/ssl/certs:/etc/ssl/certs:ro \
 --volume /lib/modules:/lib/modules:ro \
 --volume /run:/run \
---volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
+--volume /sys/fs/cgroup:/sys/fs/cgroup \
---volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
---volume /etc/pki/tls/certs:/usr/share/ca-certificates:ro \
 --volume /var/lib/calico:/var/lib/calico:ro \
 --volume /var/lib/docker:/var/lib/docker \
 --volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
 --volume /var/log:/var/log \
 --volume /var/run/lock:/var/run/lock:z \
 --volume /opt/cni/bin:/opt/cni/bin:z \
---volume /etc/iscsi:/etc/iscsi \
---volume /sbin/iscsiadm:/sbin/iscsiadm \
 $${KUBELET_IMAGE} \
 --anonymous-auth=false \
 --authentication-token-webhook \
@@ -92,7 +85,6 @@ systemd:
 --client-ca-file=/etc/kubernetes/ca.crt \
 --cluster_dns=${cluster_dns_service_ip} \
 --cluster_domain=${cluster_domain_suffix} \
---cni-conf-dir=/etc/kubernetes/cni/net.d \
 --healthz-port=0 \
 --hostname-override=${domain_name} \
 --kubeconfig=/var/lib/kubelet/kubeconfig \
@@ -127,14 +119,15 @@ systemd:
 Type=oneshot
 RemainAfterExit=true
 WorkingDirectory=/opt/bootstrap
+Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.21.1
 ExecStartPre=-/usr/bin/podman rm bootstrap
 ExecStart=/usr/bin/podman run --name bootstrap \
 --network host \
---volume /etc/kubernetes/bootstrap-secrets:/etc/kubernetes/secrets:ro,z \
+--volume /etc/kubernetes/pki:/etc/kubernetes/pki:ro,z \
 --volume /opt/bootstrap/assets:/assets:ro,Z \
 --volume /opt/bootstrap/apply:/apply:ro,Z \
 --entrypoint=/apply \
-quay.io/poseidon/kubelet:v1.19.2
+$${KUBELET_IMAGE}
 ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
 ExecStartPost=-/usr/bin/podman stop bootstrap
 storage:
@@ -157,26 +150,26 @@ storage:
 mkdir -p -- auth tls/etcd tls/k8s static-manifests manifests/coredns manifests-networking
 awk '/#####/ {filename=$2; next} {print > filename}' assets
 mkdir -p /etc/ssl/etcd/etcd
-mkdir -p /etc/kubernetes/bootstrap-secrets
+mkdir -p /etc/kubernetes/pki
 mv tls/etcd/{peer*,server*} /etc/ssl/etcd/etcd/
-mv tls/etcd/etcd-client* /etc/kubernetes/bootstrap-secrets/
+mv tls/etcd/etcd-client* /etc/kubernetes/pki/
 chown -R etcd:etcd /etc/ssl/etcd
 chmod -R 500 /etc/ssl/etcd
-mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
-mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
+mv auth/* /etc/kubernetes/pki/
+mv tls/k8s/* /etc/kubernetes/pki/
 mkdir -p /etc/kubernetes/manifests
 mv static-manifests/* /etc/kubernetes/manifests/
 mkdir -p /opt/bootstrap/assets
 mv manifests /opt/bootstrap/assets/manifests
 mv manifests-networking/* /opt/bootstrap/assets/manifests/
 rm -rf assets auth static-manifests tls manifests-networking
-chcon -R -u system_u -t container_file_t /etc/kubernetes/bootstrap-secrets
+chcon -R -u system_u -t container_file_t /etc/kubernetes/pki
 - path: /opt/bootstrap/apply
 mode: 0544
 contents:
 inline: |
 #!/bin/bash -e
-export KUBECONFIG=/etc/kubernetes/secrets/kubeconfig
+export KUBECONFIG=/etc/kubernetes/pki/admin.conf
 until kubectl version; do
 echo "Waiting for static pod control plane"
 sleep 5
@@ -212,8 +205,6 @@ storage:
 mode: 0644
 contents:
 inline: |
-# TODO: Use a systemd dropin once podman v1.4.5 is avail.
-NOTIFY_SOCKET=/run/systemd/notify
 ETCD_NAME=${etcd_name}
 ETCD_DATA_DIR=/var/lib/etcd
 ETCD_ADVERTISE_CLIENT_URLS=https://${domain_name}:2379
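The hunks above move the etcd image reference into an ETCD_IMAGE environment variable pinned at v3.4.16 (the `$${...}` form is Terraform template escaping, so the rendered unit contains a literal `${...}` for systemd to expand). A quick spot-check on a controller node, assuming SSH access as the core user; these commands are illustrative and not part of the change:

# Show the rendered unit and the pinned image variable
systemctl cat etcd-member.service | grep ETCD_IMAGE
# expected: Environment=ETCD_IMAGE=quay.io/coreos/etcd:v3.4.16

# Confirm the running container was started from that image
sudo podman container inspect etcd --format '{{.ImageName}}'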


@@ -1,6 +1,6 @@
 ---
 variant: fcos
-version: 1.1.0
+version: 1.2.0
 systemd:
 units:
 - name: docker.service
@@ -23,8 +23,8 @@ systemd:
 Description=Kubelet (System Container)
 Wants=rpc-statd.service
 [Service]
-Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.19.2
-ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
+Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.21.1
+ExecStartPre=/bin/mkdir -p /etc/cni/net.d
 ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
 ExecStartPre=/bin/mkdir -p /opt/cni/bin
 ExecStartPre=/bin/mkdir -p /var/lib/calico
@@ -35,22 +35,18 @@ systemd:
 --privileged \
 --pid host \
 --network host \
+--volume /etc/cni/net.d:/etc/cni/net.d:ro,z \
 --volume /etc/kubernetes:/etc/kubernetes:ro,z \
 --volume /usr/lib/os-release:/etc/os-release:ro \
---volume /etc/ssl/certs:/etc/ssl/certs:ro \
 --volume /lib/modules:/lib/modules:ro \
 --volume /run:/run \
---volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
+--volume /sys/fs/cgroup:/sys/fs/cgroup \
---volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
---volume /etc/pki/tls/certs:/usr/share/ca-certificates:ro \
 --volume /var/lib/calico:/var/lib/calico:ro \
 --volume /var/lib/docker:/var/lib/docker \
 --volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
 --volume /var/log:/var/log \
 --volume /var/run/lock:/var/run/lock:z \
 --volume /opt/cni/bin:/opt/cni/bin:z \
---volume /etc/iscsi:/etc/iscsi \
---volume /sbin/iscsiadm:/sbin/iscsiadm \
 $${KUBELET_IMAGE} \
 --anonymous-auth=false \
 --authentication-token-webhook \
@@ -62,7 +58,6 @@ systemd:
 --client-ca-file=/etc/kubernetes/ca.crt \
 --cluster_dns=${cluster_dns_service_ip} \
 --cluster_domain=${cluster_domain_suffix} \
---cni-conf-dir=/etc/kubernetes/cni/net.d \
 --healthz-port=0 \
 --hostname-override=${domain_name} \
 --kubeconfig=/var/lib/kubelet/kubeconfig \
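The kubelet mount changes above (a single writable /sys/fs/cgroup bind instead of a read-only mount plus /sys/fs/cgroup/systemd) line up with Fedora CoreOS moving toward cgroups v2. A common way to check which hierarchy a node actually booted with (illustrative, not part of the change):

# cgroup2fs indicates the unified cgroups v2 hierarchy;
# tmpfs indicates the legacy cgroups v1 layout
stat -fc %T /sys/fs/cgroup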


@@ -1,4 +1,12 @@
 output "kubeconfig-admin" {
   value = module.bootstrap.kubeconfig-admin
+  sensitive = true
+}
+
+# Outputs for debug
+
+output "assets_dist" {
+  value = module.bootstrap.assets_dist
+  sensitive = true
 }
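With kubeconfig-admin now marked `sensitive = true`, plain `terraform output` redacts the value; the new assets_dist output is likewise sensitive and intended for debugging. A minimal sketch of retrieving the kubeconfig for local use, assuming a recent Terraform CLI that supports `-raw`:

# Write the admin kubeconfig to a local file and point kubectl at it
terraform output -raw kubeconfig-admin > kubeconfig
export KUBECONFIG=$PWD/kubeconfig
kubectl get nodes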

Some files were not shown because too many files have changed in this diff.