mirror of
https://github.com/puppetmaster/typhoon.git
synced 2025-08-02 21:21:34 +02:00
Compare commits
88 Commits
Author | SHA1 | Date | |
---|---|---|---|
70bdc9ec94 | |||
144bb9403c | |||
5fca08064b | |||
fc686c8fc7 | |||
a1a5da6bc2 | |||
076b8e3c42 | |||
ef5f953e04 | |||
d25f23e675 | |||
f100a90d28 | |||
c3bf8bcf96 | |||
5d1e4ad333 | |||
9f702c72d2 | |||
e556bc2167 | |||
1bf4f3b801 | |||
590d941f50 | |||
ddc1ff5348 | |||
61557e89a6 | |||
c3ef21dbf5 | |||
2a5dddeb9d | |||
75fb4e5d11 | |||
1a139ef6f1 | |||
bc7902f40a | |||
70bf39bb9a | |||
4e1b8f22df | |||
ab7913a061 | |||
7b0ea23cdc | |||
c4683c5bad | |||
51cee6d5a4 | |||
87f9a2fc35 | |||
6de5cf5a55 | |||
3250994c95 | |||
f4d260645c | |||
d9219a6722 | |||
60c7eb85ee | |||
4c964b56a0 | |||
1fbd6835f2 | |||
e4d977bfcd | |||
947c2c1815 | |||
4a38fb5927 | |||
c4e64a9d1b | |||
7ca03e5219 | |||
362b3fac5c | |||
32db59b9eb | |||
0c53ad52e4 | |||
008817b0aa | |||
49d3b9e6b3 | |||
1243f395d1 | |||
846f11097f | |||
ba84f86dc7 | |||
b49a1d715d | |||
34c3d7cc39 | |||
ca96a1335c | |||
e339fbd2b6 | |||
8cc303c9ac | |||
b19ba16afa | |||
d127a7345c | |||
02a470d2f2 | |||
5643ad525f | |||
d5b7ce8f27 | |||
1cda5bcd2a | |||
bda73264f7 | |||
dd930a2ff9 | |||
03ff3a9cf3 | |||
48703f9906 | |||
7ddd3d096d | |||
7daabd28b5 | |||
b642e3b41b | |||
ac786a2efc | |||
073fcb7067 | |||
ce0569e03b | |||
0e2fc89f78 | |||
b1f521fc4a | |||
73588cfad3 | |||
0223b31e1a | |||
bb586b60da | |||
43e05b9131 | |||
b2eb3e05d0 | |||
f1f4cd6fc0 | |||
50db3d0231 | |||
11565ffa8a | |||
a4e843693f | |||
f48e43c0b1 | |||
daa8d9d9ec | |||
52d11096dc | |||
00c431a9d2 | |||
0ecb995890 | |||
1b9fa2e688 | |||
2d8e367664 |
2
.github/ISSUE_TEMPLATE.md
vendored
2
.github/ISSUE_TEMPLATE.md
vendored
@ -5,7 +5,7 @@
|
||||
### Environment
|
||||
|
||||
* Platform: aws, azure, bare-metal, google-cloud, digital-ocean
|
||||
* OS: container-linux, flatcar-linux
|
||||
* OS: fedora-coreos, flatcar-linux
|
||||
* Release: Typhoon version or Git SHA (reporting latest is **not** helpful)
|
||||
* Terraform: `terraform version` (reporting latest is **not** helpful)
|
||||
* Plugins: Provider plugin versions (reporting latest is **not** helpful)
|
||||
|
190
CHANGES.md
190
CHANGES.md
@ -4,6 +4,178 @@ Notable changes between versions.
|
||||
|
||||
## Latest
|
||||
|
||||
## v1.18.0
|
||||
|
||||
* Kubernetes [v1.18.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1180)
|
||||
* Update etcd from v3.4.4 to [v3.4.5](https://github.com/etcd-io/etcd/releases/tag/v3.4.5)
|
||||
* Switch from upstream hyperkube image to individual images ([#669](https://github.com/poseidon/typhoon/pull/669))
|
||||
* Use upstream k8s.gcr.io `kube-apiserver`, `kube-controller-manager`, `kube-scheduler`, and `kube-proxy` container images
|
||||
* Use [poseidon/kubelet](https://github.com/poseidon/kubelet) to package the upstream Kubelet binary and dependencies as a container image (checksummed, automated build)
|
||||
* Add [quay.io/poseidon/kubelet](https://quay.io/repository/poseidon/kubelet) as a Typhoon distributed artifact in the security policy
|
||||
* Update base images from debian 9 to debian 10
|
||||
* Background: Kubernetes will [stop releasing](https://github.com/kubernetes/kubernetes/pull/88676) the hyperkube container image and provide the Kubelet as a binary for packaging
|
||||
* Choose Fedora CoreOS or Flatcar Linux (**action recommended**)
|
||||
* Use a `fedora-coreos` module for Fedora CoreOS
|
||||
* Use a `container-linux` module with OS set for Flatcar Linux (varies, see docs)
|
||||
* CoreOS Container Linux [won't receive updates](https://coreos.com/os/eol/) after May 2020
|
||||
* Add support for Fedora CoreOS snippets (`terraform-provider-ct` v0.5+) ([#686](https://github.com/poseidon/typhoon/pull/686))
|
||||
* Recommend updating `terraform-provider-ct` plugin from v0.4.0 to [v0.5.0](https://github.com/poseidon/terraform-provider-ct/releases/tag/v0.5.0)
|
||||
* Set Fedora CoreOS log driver back to the default `journald` ([#681](https://github.com/poseidon/typhoon/pull/681))
|
||||
* Deprecate `asset_dir` variable and remove docs ([#678](https://github.com/poseidon/typhoon/pull/678))
|
||||
* Deprecate support for [gitRepo](https://kubernetes.io/docs/concepts/storage/volumes/#gitrepo) volumes. A future release will drop support.
|
||||
|
||||
#### AWS
|
||||
|
||||
* Fix Fedora CoreOS AMI to filter for stable images ([#685](https://github.com/poseidon/typhoon/pull/685))
|
||||
* Latest Fedora CoreOS `testing` or `bodhi-update` images could be chosen depending on the region
|
||||
|
||||
#### Bare-Metal
|
||||
|
||||
* Update default `os_stream` from testing to stable
|
||||
|
||||
#### Google Cloud
|
||||
|
||||
* Known: Use of stale Fedora CoreOS image may require terraform re-apply during bootstrap ([#687](https://github.com/poseidon/typhoon/pull/687))
|
||||
|
||||
#### DigitalOcean
|
||||
|
||||
* Rename `image` variable to `os_image` for consistency ([#677](https://github.com/poseidon/typhoon/pull/677)) (action required)
|
||||
|
||||
#### Addons
|
||||
|
||||
* Update Prometheus from v2.16.0 to [v2.17.1](https://github.com/prometheus/prometheus/releases/tag/v2.17.1)
|
||||
* Update Grafana from v6.6.2 to [v6.7.1](https://github.com/grafana/grafana/releases/tag/v6.7.1)
|
||||
|
||||
## v1.17.4
|
||||
|
||||
* Kubernetes [v1.17.4](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.17.md#v1174)
|
||||
* Update etcd from v3.4.3 to [v3.4.4](https://github.com/etcd-io/etcd/releases/tag/v3.4.4)
|
||||
* On Container Linux, fetch using the docker transport format ([#659](https://github.com/poseidon/typhoon/pull/659))
|
||||
* Update CoreDNS from v1.6.6 to v1.6.7 ([#648](https://github.com/poseidon/typhoon/pull/648))
|
||||
* Update Calico from v3.12.0 to [v3.13.1](https://docs.projectcalico.org/v3.13/release-notes/)
|
||||
|
||||
#### AWS
|
||||
|
||||
* Promote Fedora CoreOS to stable ([#668](https://github.com/poseidon/typhoon/pull/668))
|
||||
* Allow VPC route table extension via reference ([#654](https://github.com/poseidon/typhoon/pull/654))
|
||||
* Fix `worker_node_labels` on Fedora CoreOS ([#651](https://github.com/poseidon/typhoon/pull/651))
|
||||
* Fix automatic worker node delete on shutdown on Fedora CoreOS ([#657](https://github.com/poseidon/typhoon/pull/657))
|
||||
|
||||
#### Azure
|
||||
|
||||
* Upgrade to `terraform-provider-azurerm` [v2.0+](https://www.terraform.io/docs/providers/azurerm/guides/2.0-upgrade-guide.html) (action required)
|
||||
* Change `worker_priority` from `Low` to `Spot` if used (action required)
|
||||
* Switch to Azure's new Linux VM and Linux VM Scale Set resources
|
||||
* Set controller's Azure disk caching to None
|
||||
* Associate subnets (in addition to NICs) with security groups (aesthetic)
|
||||
* Add support for Flatcar Container Linux ([#664](https://github.com/poseidon/typhoon/pull/664))
|
||||
* Requires accepting Flatcar Linux Azure Marketplace terms
|
||||
|
||||
#### Bare-Metal
|
||||
|
||||
* Add `worker_node_labels` map variable for per-worker node labels ([#663](https://github.com/poseidon/typhoon/pull/663))
|
||||
* Add `worker_node_taints` map variable for per-worker node taints ([#663](https://github.com/poseidon/typhoon/pull/663))
|
||||
|
||||
#### DigitalOcean
|
||||
|
||||
* Add support for Flatcar Container Linux ([#644](https://github.com/poseidon/typhoon/pull/644))
|
||||
|
||||
#### Google Cloud
|
||||
|
||||
* Promote Fedora CoreOS to beta ([#668](https://github.com/poseidon/typhoon/pull/668))
|
||||
* Fix `worker_node_labels` on Fedora CoreOS ([#651](https://github.com/poseidon/typhoon/pull/651))
|
||||
* Fix automatic worker node delete on shutdown on Fedora CoreOS ([#657](https://github.com/poseidon/typhoon/pull/657))
|
||||
|
||||
#### Addons
|
||||
|
||||
* Update nginx-ingress from v0.28.0 to [v0.30.0](https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.30.0)
|
||||
* Update Prometheus from v2.15.2 to [v2.16.0](https://github.com/prometheus/prometheus/releases/tag/v2.16.0)
|
||||
* Refresh Prometheus rules and alerts
|
||||
* Add a BlackboxProbeFailure alert
|
||||
* Update kube-state-metrics from v1.9.4 to v1.9.5
|
||||
* Update node-exporter from v0.18.1 to [v1.0.0-rc.0](https://github.com/prometheus/node_exporter/releases/tag/v1.0.0-rc.0)
|
||||
* Update Grafana from v6.6.1 to v6.6.2
|
||||
* Refresh Grafana dashboards
|
||||
* Remove Container Linux Update Operator (CLUO) addon example ([#667](https://github.com/poseidon/typhoon/pull/667))
|
||||
* CLUO hasn't been in active use in our clusters and won't be relevant
|
||||
beyond Container Linux. Requires patches for use on Kubernetes v1.16+
|
||||
|
||||
## v1.17.3
|
||||
|
||||
* Kubernetes [v1.17.3](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.17.md#v1173)
|
||||
* Update Calico from v3.11.2 to v3.12.0
|
||||
* Allow Fedora CoreOS clusters to pass CNCF conformance suite
|
||||
* Set Docker log driver to `json-file` as a workaround
|
||||
* Try Fedora CoreOS or Flatcar Linux alongside CoreOS [Container Linux](https://coreos.com/os/eol/) clusters (recommended)
|
||||
|
||||
#### AWS
|
||||
|
||||
* Promote Fedora CoreOS to beta ([#645](https://github.com/poseidon/typhoon/pull/645))
|
||||
|
||||
#### Bare-Metal
|
||||
|
||||
* Promote Fedora CoreOS to beta ([#645](https://github.com/poseidon/typhoon/pull/645))
|
||||
* Add Fedora CoreOS kernel arguments initrd and console ([#640](https://github.com/poseidon/typhoon/pull/640))
|
||||
|
||||
#### Google Cloud
|
||||
|
||||
* Add Terraform module for Fedora CoreOS ([#632](https://github.com/poseidon/typhoon/pull/632))
|
||||
* Add support for Flatcar Container Linux ([#639](https://github.com/poseidon/typhoon/pull/639))
|
||||
|
||||
#### Addons
|
||||
|
||||
* Update nginx-ingress from v0.27.1 to v0.28.0
|
||||
* Update kube-state-metrics from v1.9.3 to v1.9.4
|
||||
* Update Grafana from v6.5.3 to v6.6.1
|
||||
|
||||
## v1.17.2
|
||||
|
||||
* Kubernetes [v1.17.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.17.md#v1172)
|
||||
|
||||
#### AWS
|
||||
|
||||
* Promote Fedora CoreOS from preview to alpha
|
||||
|
||||
#### Bare-Metal
|
||||
|
||||
* Promote Fedora CoreOS from preview to alpha
|
||||
* Update Fedora CoreOS images location
|
||||
* Use Fedora CoreOS production [download](https://getfedora.org/coreos/download/) streams
|
||||
* Use live PXE kernel and initramfs images
|
||||
|
||||
#### Addons
|
||||
|
||||
* Update nginx-ingress from v0.26.1 to [v0.27.1](https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.27.1) ([#625](https://github.com/poseidon/typhoon/pull/625))
|
||||
* Change runAsUser from 33 to 101 for alpine-based image
|
||||
* Update kube-state-metrics from v1.9.2 to v1.9.3
|
||||
|
||||
## v1.17.1
|
||||
|
||||
* Kubernetes [v1.17.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.17.md#v1171)
|
||||
* Update CoreDNS from v1.6.5 to [v1.6.6](https://coredns.io/2019/12/11/coredns-1.6.6-release/) ([#602](https://github.com/poseidon/typhoon/pull/602))
|
||||
* Update Calico from v3.10.2 to v3.11.2 ([#604](https://github.com/poseidon/typhoon/pull/604))
|
||||
* Inline Kubelet service on Container Linux nodes ([#606](https://github.com/poseidon/typhoon/pull/606))
|
||||
* Disable unused Kubelet `127.0.0.1:10248` healthz listener ([#607](https://github.com/poseidon/typhoon/pull/607))
|
||||
* Enable kube-proxy metrics and allow Prometheus scrapes
|
||||
* Allow TCP/10249 traffic with worker node sources
|
||||
|
||||
#### AWS
|
||||
|
||||
* Update Fedora CoreOS AMI filter for fedora-coreos-31 ([#620](https://github.com/poseidon/typhoon/pull/620))
|
||||
|
||||
#### Google
|
||||
|
||||
* Allow `terraform-provider-google` v3.0+ ([#617](https://github.com/poseidon/typhoon/pull/617))
|
||||
* Only enforce `v2.19+` to ease migration, as no v3.x features are used
|
||||
|
||||
#### Addons
|
||||
|
||||
* Update Prometheus from v2.14.0 to [v2.15.2](https://github.com/prometheus/prometheus/releases/tag/v2.15.2)
|
||||
* Add discovery for kube-proxy service endpoints
|
||||
* Update kube-state-metrics from v1.8.0 to v1.9.2
|
||||
* Reduce node-exporter DaemonSet tolerations ([#614](https://github.com/poseidon/typhoon/pull/614))
|
||||
* Update Grafana from v6.5.1 to v6.5.3
|
||||
|
||||
## v1.17.0
|
||||
|
||||
* Kubernetes [v1.17.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.17.md#v1170)
|
||||
@ -226,7 +398,7 @@ Notable changes between versions.
|
||||
* Require `terraform-provider-azurerm` v1.27+ to support Terraform v0.12 (action required)
|
||||
* Avoid unneeded rotations of Regular priority virtual machine scale sets
|
||||
* Azure only allows `eviction_policy` to be set for Low priority VMs. Supporting Low priority VMs meant when Regular VMs were used, each `terraform apply` rolled workers, to set eviction_policy to null.
|
||||
* Terraform v0.12 nullable variables fix the issue so plan does not produce a diff.
|
||||
* Terraform v0.12 nullable variables fix the issue so plan does not produce a diff.
|
||||
|
||||
#### Bare-Metal
|
||||
|
||||
@ -281,7 +453,7 @@ Notable changes between versions.
|
||||
* Update Grafana from v6.1.6 to v6.2.1
|
||||
|
||||
## v1.14.2
|
||||
|
||||
|
||||
* Kubernetes [v1.14.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#v1142)
|
||||
* Update etcd from v3.3.12 to [v3.3.13](https://github.com/etcd-io/etcd/releases/tag/v3.3.13)
|
||||
* Upgrade Calico from v3.6.1 to [v3.7.2](https://docs.projectcalico.org/v3.7/release-notes/)
|
||||
@ -352,7 +524,7 @@ Notable changes between versions.
|
||||
|
||||
* Add ability to load balance TCP/UDP applications ([#442](https://github.com/poseidon/typhoon/pull/442))
|
||||
* Add worker instances to a target pool, output as `worker_target_pool`
|
||||
* Health check for workers with Ingress controllers. Forward rules don't support differing internal/external ports, but some Ingress controllers support TCP/UDP proxy as a workaround
|
||||
* Health check for workers with Ingress controllers. Forward rules don't support differing internal/external ports, but some Ingress controllers support TCP/UDP proxy as a workaround
|
||||
* Remove Haswell minimum CPU platform requirement ([#439](https://github.com/poseidon/typhoon/pull/439))
|
||||
* Google Cloud API implements `min_cpu_platform` to mean "use exactly this CPU". Revert [#405](https://github.com/poseidon/typhoon/pull/405) added in v1.13.4.
|
||||
* Fix error creating clusters in new regions without Haswell (e.g. europe-west2) ([#438](https://github.com/poseidon/typhoon/issues/438))
|
||||
@ -537,7 +709,7 @@ Notable changes between versions.
|
||||
* Update Calico from v3.3.0 to [v3.3.1](https://docs.projectcalico.org/v3.3/releases/)
|
||||
* Disable Felix usage reporting by default ([#345](https://github.com/poseidon/typhoon/pull/345))
|
||||
* Improve flannel manifests
|
||||
* [Rename](https://github.com/poseidon/terraform-render-bootkube/commit/d045a8e6b8eccfbb9d69bb51953b5a93d23f67f7) `kube-flannel` DaemonSet to `flannel` and `kube-flannel-cfg` ConfigMap to `flannel-config`
|
||||
* [Rename](https://github.com/poseidon/terraform-render-bootkube/commit/d045a8e6b8eccfbb9d69bb51953b5a93d23f67f7) `kube-flannel` DaemonSet to `flannel` and `kube-flannel-cfg` ConfigMap to `flannel-config`
|
||||
* [Drop](https://github.com/poseidon/terraform-render-bootkube/commit/39f9afb3360ec642e5b98457c8bd07eda35b6c96) unused mounts and add a CPU resource request
|
||||
* Update CoreDNS from v1.2.4 to [v1.2.6](https://coredns.io/2018/11/05/coredns-1.2.6-release/)
|
||||
* Enable CoreDNS `loop` and `loadbalance` plugins ([#340](https://github.com/poseidon/typhoon/pull/340))
|
||||
@ -699,7 +871,7 @@ Notable changes between versions.
|
||||
* Force apiserver to stop listening on `127.0.0.1:8080`
|
||||
* Replace `kube-dns` with [CoreDNS](https://coredns.io/) ([#261](https://github.com/poseidon/typhoon/pull/261))
|
||||
* Edit the `coredns` ConfigMap to [customize](https://coredns.io/plugins/)
|
||||
* CoreDNS doesn't use a resizer. For large clusters, scaling may be required.
|
||||
* CoreDNS doesn't use a resizer. For large clusters, scaling may be required.
|
||||
|
||||
#### AWS
|
||||
|
||||
@ -744,7 +916,7 @@ Notable changes between versions.
|
||||
|
||||
* Switch `kube-apiserver` port from 443 to 6443 ([#248](https://github.com/poseidon/typhoon/pull/248))
|
||||
* Users who exposed kube-apiserver on a WAN via their router/load-balancer will need to adjust its configuration (e.g. DNAT 6443). Most apiservers are on a LAN (internal, VPN-only, etc) so if you didn't specially configure network gear for 443, no change is needed. (possible action required)
|
||||
* Fix possible deadlock when provisioning clusters larger than 10 nodes ([#244](https://github.com/poseidon/typhoon/pull/244))
|
||||
* Fix possible deadlock when provisioning clusters larger than 10 nodes ([#244](https://github.com/poseidon/typhoon/pull/244))
|
||||
|
||||
#### DigitalOcean
|
||||
|
||||
@ -812,7 +984,7 @@ Notable changes between versions.
|
||||
* Please change values stable, beta, or alpha to coreos-stable, coreos-beta, coreos-alpha (**action required!**)
|
||||
* Replace `container_linux_version` variable with `os_version`
|
||||
* Add `network_ip_autodetection_method` variable for Calico host IPv4 address detection
|
||||
* Use Calico's default "first-found" to support single NIC and bonded NIC nodes
|
||||
* Use Calico's default "first-found" to support single NIC and bonded NIC nodes
|
||||
* Allow [alternative](https://docs.projectcalico.org/v3.1/reference/node/configuration#ip-autodetection-methods) methods for multi NIC nodes, like can-reach=IP or interface=REGEX
|
||||
* Deprecate `container_linux_oem` variable
|
||||
|
||||
@ -845,7 +1017,7 @@ Notable changes between versions.
|
||||
#### Google Cloud
|
||||
|
||||
* Add support for multi-controller clusters (i.e. multi-master) ([#54](https://github.com/poseidon/typhoon/issues/54), [#190](https://github.com/poseidon/typhoon/pull/190))
|
||||
* Switch from Google Cloud network load balancer to a TCP proxy load balancer. Avoid a [bug](https://issuetracker.google.com/issues/67366622) in Google network load balancers that limited clusters to only bootstrapping one controller node.
|
||||
* Switch from Google Cloud network load balancer to a TCP proxy load balancer. Avoid a [bug](https://issuetracker.google.com/issues/67366622) in Google network load balancers that limited clusters to only bootstrapping one controller node.
|
||||
* Add TCP health check for apiserver pods on controllers. Replace kubelet check approximation.
|
||||
|
||||
#### Addons
|
||||
@ -1076,7 +1248,7 @@ Notable changes between versions.
|
||||
* Container Linux stable, beta, and alpha now provide Docker 17.09 (instead
|
||||
of 1.12)
|
||||
* Older clusters (with CLUO addon) auto-update Container Linux version to begin using Docker 17.09
|
||||
* Fix race where `etcd-member.service` could fail to resolve peers ([#69](https://github.com/poseidon/typhoon/pull/69))
|
||||
* Fix race where `etcd-member.service` could fail to resolve peers ([#69](https://github.com/poseidon/typhoon/pull/69))
|
||||
* Add optional `cluster_domain_suffix` variable (#74)
|
||||
* Use kubernetes-incubator/bootkube v0.9.1
|
||||
|
||||
|
36
README.md
36
README.md
@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.17.0 (upstream)
|
||||
* Kubernetes v1.18.0 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [preemptible](https://typhoon.psdn.io/cl/google-cloud/#preemption) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
|
||||
@ -23,24 +23,36 @@ Typhoon provides a Terraform Module for each supported operating system and plat
|
||||
|
||||
| Platform | Operating System | Terraform Module | Status |
|
||||
|---------------|------------------|------------------|--------|
|
||||
| AWS | Container Linux / Flatcar Linux | [aws/container-linux/kubernetes](aws/container-linux/kubernetes) | stable |
|
||||
| AWS | Container Linux | [aws/container-linux/kubernetes](aws/container-linux/kubernetes) | stable |
|
||||
| Azure | Container Linux | [azure/container-linux/kubernetes](azure/container-linux/kubernetes) | alpha |
|
||||
| Bare-Metal | Container Linux / Flatcar Linux | [bare-metal/container-linux/kubernetes](bare-metal/container-linux/kubernetes) | stable |
|
||||
| Bare-Metal | Container Linux | [bare-metal/container-linux/kubernetes](bare-metal/container-linux/kubernetes) | stable |
|
||||
| Digital Ocean | Container Linux | [digital-ocean/container-linux/kubernetes](digital-ocean/container-linux/kubernetes) | beta |
|
||||
| Google Cloud | Container Linux | [google-cloud/container-linux/kubernetes](google-cloud/container-linux/kubernetes) | stable |
|
||||
|
||||
A preview of Typhoon for [Fedora CoreOS](https://getfedora.org/coreos/) is available for testing.
|
||||
Typhoon is available for [Fedora CoreOS](https://getfedora.org/coreos/).
|
||||
|
||||
| Platform | Operating System | Terraform Module | Status |
|
||||
|---------------|------------------|------------------|--------|
|
||||
| AWS | Fedora CoreOS | [aws/fedora-coreos/kubernetes](aws/fedora-coreos/kubernetes) | preview |
|
||||
| Bare-Metal | Fedora CoreOS | [bare-metal/fedora-coreos/kubernetes](bare-metal/fedora-coreos/kubernetes) | preview |
|
||||
| AWS | Fedora CoreOS | [aws/fedora-coreos/kubernetes](aws/fedora-coreos/kubernetes) | stable |
|
||||
| Bare-Metal | Fedora CoreOS | [bare-metal/fedora-coreos/kubernetes](bare-metal/fedora-coreos/kubernetes) | beta |
|
||||
| Google Cloud | Fedora CoreOS | [google-cloud/fedora-coreos/kubernetes](google-cloud/fedora-coreos/kubernetes) | beta |
|
||||
|
||||
Typhoon is available for [Flatcar Container Linux](https://www.flatcar-linux.org/releases/).
|
||||
|
||||
| Platform | Operating System | Terraform Module | Status |
|
||||
|---------------|------------------|------------------|--------|
|
||||
| AWS | Flatcar Linux | [aws/container-linux/kubernetes](aws/container-linux/kubernetes) | stable |
|
||||
| Azure | Flatcar Linux | [azure/container-linux/kubernetes](azure/container-linux/kubernetes) | alpha |
|
||||
| Bare-Metal | Flatcar Linux | [bare-metal/container-linux/kubernetes](bare-metal/container-linux/kubernetes) | stable |
|
||||
| Google Cloud | Flatcar Linux | [google-cloud/container-linux/kubernetes](google-cloud/container-linux/kubernetes) | alpha |
|
||||
| Digital Ocean | Flatcar Linux | [digital-ocean/container-linux/kubernetes](digital-ocean/container-linux/kubernetes) | alpha |
|
||||
|
||||
## Documentation
|
||||
|
||||
* [Docs](https://typhoon.psdn.io)
|
||||
* Architecture [concepts](https://typhoon.psdn.io/architecture/concepts/) and [operating systems](https://typhoon.psdn.io/architecture/operating-systems/)
|
||||
* Tutorials for [AWS](docs/cl/aws.md), [Azure](docs/cl/azure.md), [Bare-Metal](docs/cl/bare-metal.md), [Digital Ocean](docs/cl/digital-ocean.md), and [Google-Cloud](docs/cl/google-cloud.md)
|
||||
* Fedora CoreOS tutorials for [AWS](docs/fedora-coreos/aws.md), [Bare-Metal](docs/fedora-coreos/bare-metal.md), and [Google Cloud](docs/fedora-coreos/google-cloud.md)
|
||||
* Flatcar Linux tutorials for [AWS](docs/cl/aws.md), [Azure](docs/cl/azure.md), [Bare-Metal](docs/cl/bare-metal.md), [DigitalOcean](docs/cl/digital-ocean.md), and [Google Cloud](docs/cl/google-cloud.md)
|
||||
|
||||
## Usage
|
||||
|
||||
@ -48,7 +60,7 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
|
||||
|
||||
```tf
|
||||
module "yavin" {
|
||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.17.0"
|
||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.18.0"
|
||||
|
||||
# Google Cloud
|
||||
cluster_name = "yavin"
|
||||
@ -58,7 +70,7 @@ module "yavin" {
|
||||
|
||||
# configuration
|
||||
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
|
||||
|
||||
|
||||
# optional
|
||||
worker_count = 2
|
||||
worker_preemptible = true
|
||||
@ -87,9 +99,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/yavin-config
|
||||
$ kubectl get nodes
|
||||
NAME ROLES STATUS AGE VERSION
|
||||
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.17.0
|
||||
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.17.0
|
||||
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.17.0
|
||||
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.18.0
|
||||
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.18.0
|
||||
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.18.0
|
||||
```
|
||||
|
||||
List the pods.
|
||||
|
@ -1,4 +0,0 @@
|
||||
apiVersion: v1
|
||||
kind: Namespace
|
||||
metadata:
|
||||
name: reboot-coordinator
|
@ -1,12 +0,0 @@
|
||||
apiVersion: rbac.authorization.k8s.io/v1
|
||||
kind: ClusterRoleBinding
|
||||
metadata:
|
||||
name: reboot-coordinator
|
||||
roleRef:
|
||||
apiGroup: rbac.authorization.k8s.io
|
||||
kind: ClusterRole
|
||||
name: reboot-coordinator
|
||||
subjects:
|
||||
- kind: ServiceAccount
|
||||
namespace: reboot-coordinator
|
||||
name: default
|
@ -1,45 +0,0 @@
|
||||
apiVersion: rbac.authorization.k8s.io/v1
|
||||
kind: ClusterRole
|
||||
metadata:
|
||||
name: reboot-coordinator
|
||||
rules:
|
||||
- apiGroups:
|
||||
- ""
|
||||
resources:
|
||||
- nodes
|
||||
verbs:
|
||||
- get
|
||||
- list
|
||||
- watch
|
||||
- update
|
||||
- apiGroups:
|
||||
- ""
|
||||
resources:
|
||||
- configmaps
|
||||
verbs:
|
||||
- create
|
||||
- get
|
||||
- update
|
||||
- list
|
||||
- watch
|
||||
- apiGroups:
|
||||
- ""
|
||||
resources:
|
||||
- events
|
||||
verbs:
|
||||
- create
|
||||
- watch
|
||||
- apiGroups:
|
||||
- ""
|
||||
resources:
|
||||
- pods
|
||||
verbs:
|
||||
- get
|
||||
- list
|
||||
- delete
|
||||
- apiGroups:
|
||||
- "extensions"
|
||||
resources:
|
||||
- daemonsets
|
||||
verbs:
|
||||
- get
|
@ -1,68 +0,0 @@
|
||||
apiVersion: apps/v1
|
||||
kind: DaemonSet
|
||||
metadata:
|
||||
name: container-linux-update-agent
|
||||
namespace: reboot-coordinator
|
||||
spec:
|
||||
updateStrategy:
|
||||
type: RollingUpdate
|
||||
rollingUpdate:
|
||||
maxUnavailable: 1
|
||||
selector:
|
||||
matchLabels:
|
||||
name: container-linux-update-agent
|
||||
template:
|
||||
metadata:
|
||||
labels:
|
||||
name: container-linux-update-agent
|
||||
annotations:
|
||||
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
|
||||
spec:
|
||||
tolerations:
|
||||
- key: node-role.kubernetes.io/master
|
||||
operator: Exists
|
||||
effect: NoSchedule
|
||||
containers:
|
||||
- name: update-agent
|
||||
image: quay.io/coreos/container-linux-update-operator:v0.7.0
|
||||
command:
|
||||
- "/bin/update-agent"
|
||||
env:
|
||||
# read by update-agent as the node name to manage reboots for
|
||||
- name: UPDATE_AGENT_NODE
|
||||
valueFrom:
|
||||
fieldRef:
|
||||
fieldPath: spec.nodeName
|
||||
- name: POD_NAMESPACE
|
||||
valueFrom:
|
||||
fieldRef:
|
||||
fieldPath: metadata.namespace
|
||||
resources:
|
||||
requests:
|
||||
cpu: 10m
|
||||
memory: 20Mi
|
||||
limits:
|
||||
cpu: 20m
|
||||
memory: 40Mi
|
||||
volumeMounts:
|
||||
- mountPath: /var/run/dbus
|
||||
name: var-run-dbus
|
||||
- mountPath: /etc/coreos
|
||||
name: etc-coreos
|
||||
- mountPath: /usr/share/coreos
|
||||
name: usr-share-coreos
|
||||
- mountPath: /etc/os-release
|
||||
name: etc-os-release
|
||||
volumes:
|
||||
- name: var-run-dbus
|
||||
hostPath:
|
||||
path: /var/run/dbus
|
||||
- name: etc-coreos
|
||||
hostPath:
|
||||
path: /etc/coreos
|
||||
- name: usr-share-coreos
|
||||
hostPath:
|
||||
path: /usr/share/coreos
|
||||
- name: etc-os-release
|
||||
hostPath:
|
||||
path: /etc/os-release
|
@ -1,39 +0,0 @@
|
||||
apiVersion: apps/v1
|
||||
kind: Deployment
|
||||
metadata:
|
||||
name: container-linux-update-operator
|
||||
namespace: reboot-coordinator
|
||||
spec:
|
||||
replicas: 1
|
||||
selector:
|
||||
matchLabels:
|
||||
name: container-linux-update-operator
|
||||
template:
|
||||
metadata:
|
||||
labels:
|
||||
name: container-linux-update-operator
|
||||
annotations:
|
||||
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
|
||||
spec:
|
||||
tolerations:
|
||||
- key: node-role.kubernetes.io/master
|
||||
operator: Exists
|
||||
effect: NoSchedule
|
||||
containers:
|
||||
- name: update-operator
|
||||
image: quay.io/coreos/container-linux-update-operator:v0.7.0
|
||||
command:
|
||||
- "/bin/update-operator"
|
||||
env:
|
||||
- name: POD_NAMESPACE
|
||||
valueFrom:
|
||||
fieldRef:
|
||||
fieldPath: metadata.namespace
|
||||
resources:
|
||||
requests:
|
||||
cpu: 10m
|
||||
memory: 20Mi
|
||||
limits:
|
||||
cpu: 20m
|
||||
memory: 40Mi
|
||||
|
@ -21,7 +21,7 @@ data:
|
||||
"links": [
|
||||
|
||||
],
|
||||
"refresh": "",
|
||||
"refresh": "10s",
|
||||
"rows": [
|
||||
{
|
||||
"collapse": false,
|
||||
@ -558,15 +558,15 @@ data:
|
||||
},
|
||||
"id": 8,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -649,15 +649,15 @@ data:
|
||||
},
|
||||
"id": 9,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -753,15 +753,15 @@ data:
|
||||
},
|
||||
"id": 10,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -857,15 +857,15 @@ data:
|
||||
},
|
||||
"id": 11,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -955,15 +955,15 @@ data:
|
||||
},
|
||||
"id": 12,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -1066,17 +1066,17 @@ data:
|
||||
},
|
||||
"id": 13,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"hideEmpty": "true",
|
||||
"hideZero": "true",
|
||||
"current": true,
|
||||
"hideEmpty": true,
|
||||
"hideZero": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -1159,17 +1159,17 @@ data:
|
||||
},
|
||||
"id": 14,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"hideEmpty": "true",
|
||||
"hideZero": "true",
|
||||
"current": true,
|
||||
"hideEmpty": true,
|
||||
"hideZero": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -1265,17 +1265,17 @@ data:
|
||||
},
|
||||
"id": 15,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"hideEmpty": "true",
|
||||
"hideZero": "true",
|
||||
"current": true,
|
||||
"hideEmpty": true,
|
||||
"hideZero": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -1371,15 +1371,15 @@ data:
|
||||
},
|
||||
"id": 16,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -1462,15 +1462,15 @@ data:
|
||||
},
|
||||
"id": 17,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -1567,15 +1567,15 @@ data:
|
||||
},
|
||||
"id": 18,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -1658,15 +1658,15 @@ data:
|
||||
},
|
||||
"id": 19,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -1762,15 +1762,15 @@ data:
|
||||
},
|
||||
"id": 20,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -1991,15 +1991,15 @@ data:
|
||||
},
|
||||
"id": 22,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -2373,8 +2373,8 @@ data:
|
||||
"list": [
|
||||
{
|
||||
"current": {
|
||||
"text": "Prometheus",
|
||||
"value": "Prometheus"
|
||||
"text": "default",
|
||||
"value": "default"
|
||||
},
|
||||
"hide": 0,
|
||||
"label": null,
|
||||
@ -2427,7 +2427,7 @@ data:
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kubelet_runtime_operations{cluster=\"$cluster\", job=\"kubelet\"}, instance)",
|
||||
"query": "label_values(kubelet_runtime_operations_total{cluster=\"$cluster\", job=\"kubelet\"}, instance)",
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
@ -2496,7 +2496,7 @@ data:
|
||||
"links": [
|
||||
|
||||
],
|
||||
"refresh": "",
|
||||
"refresh": "10s",
|
||||
"rows": [
|
||||
{
|
||||
"collapse": false,
|
||||
@ -2691,15 +2691,15 @@ data:
|
||||
},
|
||||
"id": 4,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -2886,15 +2886,15 @@ data:
|
||||
},
|
||||
"id": 6,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -3206,15 +3206,15 @@ data:
|
||||
},
|
||||
"id": 9,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -3588,8 +3588,8 @@ data:
|
||||
"list": [
|
||||
{
|
||||
"current": {
|
||||
"text": "Prometheus",
|
||||
"value": "Prometheus"
|
||||
"text": "default",
|
||||
"value": "default"
|
||||
},
|
||||
"hide": 0,
|
||||
"label": null,
|
||||
|
@ -2458,8 +2458,8 @@ data:
|
||||
"list": [
|
||||
{
|
||||
"current": {
|
||||
"text": "Prometheus",
|
||||
"value": "Prometheus"
|
||||
"text": "default",
|
||||
"value": "default"
|
||||
},
|
||||
"hide": 0,
|
||||
"label": null,
|
||||
@ -2508,7 +2508,7 @@ data:
|
||||
"text": "5m",
|
||||
"value": "5m"
|
||||
},
|
||||
"datasource": "prometheus",
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
@ -2533,6 +2533,33 @@ data:
|
||||
"tagsQuery": "",
|
||||
"type": "interval",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "",
|
||||
"value": ""
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "cluster",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(node_cpu_seconds_total, cluster)",
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
}
|
||||
]
|
||||
},
|
||||
@ -2586,6 +2613,354 @@ data:
|
||||
],
|
||||
"refresh": "10s",
|
||||
"rows": [
|
||||
{
|
||||
"collapse": false,
|
||||
"height": "100px",
|
||||
"panels": [
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"format": "percentunit",
|
||||
"id": 1,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null as zero",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 3,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) / sum(kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"})",
|
||||
"format": "time_series",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "70,80",
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "CPU Utilisation (from requests)",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "singlestat",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": false
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"format": "percentunit",
|
||||
"id": 2,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null as zero",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 3,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster=\"$cluster\", namespace=\"$namespace\"}) / sum(kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\"})",
|
||||
"format": "time_series",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "70,80",
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "CPU Utilisation (from limits)",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "singlestat",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": false
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"format": "percentunit",
|
||||
"id": 3,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null as zero",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 3,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) / sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\"})",
|
||||
"format": "time_series",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "70,80",
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Memory Utilization (from requests)",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "singlestat",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": false
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"format": "percentunit",
|
||||
"id": 4,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null as zero",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 3,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\",container!=\"\"}) / sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\"})",
|
||||
"format": "time_series",
|
||||
"instant": true,
|
||||
"intervalFactor": 2,
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "70,80",
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Memory Utilisation (from limits)",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "singlestat",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": false
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"repeat": null,
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": false,
|
||||
"title": "Headlines",
|
||||
"titleSize": "h6"
|
||||
},
|
||||
{
|
||||
"collapse": false,
|
||||
"height": "250px",
|
||||
@ -2599,7 +2974,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 1,
|
||||
"id": 5,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -2620,7 +2995,26 @@ data:
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [
|
||||
|
||||
{
|
||||
"alias": "quota - requests",
|
||||
"color": "#F2495C",
|
||||
"dashes": true,
|
||||
"fill": 0,
|
||||
"hideTooltip": true,
|
||||
"legend": false,
|
||||
"linewidth": 2,
|
||||
"stack": false
|
||||
},
|
||||
{
|
||||
"alias": "quota - limits",
|
||||
"color": "#FF9830",
|
||||
"dashes": true,
|
||||
"fill": 0,
|
||||
"hideTooltip": true,
|
||||
"legend": false,
|
||||
"linewidth": 2,
|
||||
"stack": false
|
||||
}
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 12,
|
||||
@ -2634,6 +3028,22 @@ data:
|
||||
"legendFormat": "{{pod}}",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "scalar(kube_resourcequota{cluster=\"$cluster\", namespace=\"$namespace\", type=\"hard\",resource=\"requests.cpu\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "quota - requests",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "scalar(kube_resourcequota{cluster=\"$cluster\", namespace=\"$namespace\", type=\"hard\",resource=\"limits.cpu\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "quota - limits",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
@ -2697,7 +3107,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"id": 2,
|
||||
"id": 6,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -2964,7 +3374,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 3,
|
||||
"id": 7,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -2985,7 +3395,26 @@ data:
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [
|
||||
|
||||
{
|
||||
"alias": "quota - requests",
|
||||
"color": "#F2495C",
|
||||
"dashes": true,
|
||||
"fill": 0,
|
||||
"hideTooltip": true,
|
||||
"legend": false,
|
||||
"linewidth": 2,
|
||||
"stack": false
|
||||
},
|
||||
{
|
||||
"alias": "quota - limits",
|
||||
"color": "#FF9830",
|
||||
"dashes": true,
|
||||
"fill": 0,
|
||||
"hideTooltip": true,
|
||||
"legend": false,
|
||||
"linewidth": 2,
|
||||
"stack": false
|
||||
}
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 12,
|
||||
@ -2999,6 +3428,22 @@ data:
|
||||
"legendFormat": "{{pod}}",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "scalar(kube_resourcequota{cluster=\"$cluster\", namespace=\"$namespace\", type=\"hard\",resource=\"requests.memory\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "quota - requests",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "scalar(kube_resourcequota{cluster=\"$cluster\", namespace=\"$namespace\", type=\"hard\",resource=\"limits.memory\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "quota - limits",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
@ -3062,7 +3507,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"id": 4,
|
||||
"id": 8,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -3410,7 +3855,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"id": 5,
|
||||
"id": 9,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -3704,7 +4149,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 6,
|
||||
"id": 10,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -3802,7 +4247,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 7,
|
||||
"id": 11,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -3900,7 +4345,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 8,
|
||||
"id": 12,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -3998,7 +4443,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 9,
|
||||
"id": 13,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -4096,7 +4541,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 10,
|
||||
"id": 14,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -4194,7 +4639,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 11,
|
||||
"id": 15,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -4289,8 +4734,8 @@ data:
|
||||
"list": [
|
||||
{
|
||||
"current": {
|
||||
"text": "Prometheus",
|
||||
"value": "Prometheus"
|
||||
"text": "default",
|
||||
"value": "default"
|
||||
},
|
||||
"hide": 0,
|
||||
"label": null,
|
||||
@ -4303,60 +4748,6 @@ data:
|
||||
"regex": "",
|
||||
"type": "datasource"
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "prod",
|
||||
"value": "prod"
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": "cluster",
|
||||
"multi": false,
|
||||
"name": "cluster",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info, cluster)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 2,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "prod",
|
||||
"value": "prod"
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": "namespace",
|
||||
"multi": false,
|
||||
"name": "namespace",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 2,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"auto": false,
|
||||
@ -4366,7 +4757,7 @@ data:
|
||||
"text": "5m",
|
||||
"value": "5m"
|
||||
},
|
||||
"datasource": "prometheus",
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
@ -4391,6 +4782,60 @@ data:
|
||||
"tagsQuery": "",
|
||||
"type": "interval",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "",
|
||||
"value": ""
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "cluster",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info, cluster)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "",
|
||||
"value": ""
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "namespace",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
}
|
||||
]
|
||||
},
|
||||
@ -5265,8 +5710,8 @@ data:
|
||||
"list": [
|
||||
{
|
||||
"current": {
|
||||
"text": "Prometheus",
|
||||
"value": "Prometheus"
|
||||
"text": "default",
|
||||
"value": "default"
|
||||
},
|
||||
"hide": 0,
|
||||
"label": null,
|
||||
@ -5281,14 +5726,49 @@ data:
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"auto": false,
|
||||
"auto_count": 30,
|
||||
"auto_min": "10s",
|
||||
"current": {
|
||||
"text": "prod",
|
||||
"value": "prod"
|
||||
"text": "5m",
|
||||
"value": "5m"
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": "cluster",
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "interval",
|
||||
"options": [
|
||||
{
|
||||
"selected": true,
|
||||
"text": "4h",
|
||||
"value": "4h"
|
||||
}
|
||||
],
|
||||
"query": "4h",
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"skipUrlSync": false,
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "interval",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "",
|
||||
"value": ""
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "cluster",
|
||||
"options": [
|
||||
@ -5297,7 +5777,7 @@ data:
|
||||
"query": "label_values(kube_pod_info, cluster)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 2,
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
@ -5309,13 +5789,13 @@ data:
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "prod",
|
||||
"value": "prod"
|
||||
"text": "",
|
||||
"value": ""
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": "node",
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "node",
|
||||
"options": [
|
||||
@ -5324,7 +5804,7 @@ data:
|
||||
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, node)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 2,
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
|
@ -50,7 +50,24 @@ data:
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [
|
||||
|
||||
{
|
||||
"alias": "requests",
|
||||
"color": "#F2495C",
|
||||
"fill": 0,
|
||||
"hideTooltip": true,
|
||||
"legend": true,
|
||||
"linewidth": 2,
|
||||
"stack": false
|
||||
},
|
||||
{
|
||||
"alias": "limits",
|
||||
"color": "#FF9830",
|
||||
"fill": 0,
|
||||
"hideTooltip": true,
|
||||
"legend": true,
|
||||
"linewidth": 2,
|
||||
"stack": false
|
||||
}
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 12,
|
||||
@ -64,6 +81,22 @@ data:
|
||||
"legendFormat": "{{container}}",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n kube_pod_container_resource_requests_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"})\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "requests",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n kube_pod_container_resource_limits_cpu_cores{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"})\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "limits",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
@ -126,8 +159,113 @@ data:
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"fill": 10,
|
||||
"id": 2,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": true,
|
||||
"max": true,
|
||||
"min": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 0,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null as zero",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 12,
|
||||
"stack": true,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(increase(container_cpu_cfs_throttled_periods_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", cluster=\"$cluster\"}[5m])) by (container) /sum(increase(container_cpu_cfs_periods_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", cluster=\"$cluster\"}[5m])) by (container)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{container}}",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
{
|
||||
"colorMode": "critical",
|
||||
"fill": true,
|
||||
"line": true,
|
||||
"op": "gt",
|
||||
"value": 1,
|
||||
"yaxis": "left"
|
||||
}
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "CPU Throttling",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "percentunit",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": 1,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": false
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"repeat": null,
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": true,
|
||||
"title": "CPU Throttling",
|
||||
"titleSize": "h6"
|
||||
},
|
||||
{
|
||||
"collapse": false,
|
||||
"height": "250px",
|
||||
"panels": [
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"id": 3,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -394,7 +532,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 3,
|
||||
"id": 4,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -415,7 +553,26 @@ data:
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [
|
||||
|
||||
{
|
||||
"alias": "requests",
|
||||
"color": "#F2495C",
|
||||
"dashes": true,
|
||||
"fill": 0,
|
||||
"hideTooltip": true,
|
||||
"legend": false,
|
||||
"linewidth": 2,
|
||||
"stack": false
|
||||
},
|
||||
{
|
||||
"alias": "limits",
|
||||
"color": "#FF9830",
|
||||
"dashes": true,
|
||||
"fill": 0,
|
||||
"hideTooltip": true,
|
||||
"legend": false,
|
||||
"linewidth": 2,
|
||||
"stack": false
|
||||
}
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 12,
|
||||
@ -423,26 +580,26 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(container_memory_rss{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\"}) by (container)",
|
||||
"expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\"}) by (container)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{container}} (RSS)",
|
||||
"legendFormat": "{{container}}",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(container_memory_cache{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\"}) by (container)",
|
||||
"expr": "sum(\n kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"})\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{container}} (Cache)",
|
||||
"legendFormat": "requests",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "sum(container_memory_swap{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\"}) by (container)",
|
||||
"expr": "sum(\n kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"})\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{container}} (Swap)",
|
||||
"legendFormat": "limits",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
}
|
||||
@ -508,7 +665,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"id": 4,
|
||||
"id": 5,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -856,7 +1013,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 5,
|
||||
"id": 6,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -954,7 +1111,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 6,
|
||||
"id": 7,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -1052,7 +1209,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 7,
|
||||
"id": 8,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -1150,7 +1307,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 8,
|
||||
"id": 9,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -1248,7 +1405,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 9,
|
||||
"id": 10,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -1346,7 +1503,7 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 10,
|
||||
"id": 10,
|
||||
"id": 11,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
@ -1441,8 +1598,8 @@ data:
|
||||
"list": [
|
||||
{
|
||||
"current": {
|
||||
"text": "Prometheus",
|
||||
"value": "Prometheus"
|
||||
"text": "default",
|
||||
"value": "default"
|
||||
},
|
||||
"hide": 0,
|
||||
"label": null,
|
||||
@ -1455,87 +1612,6 @@ data:
|
||||
"regex": "",
|
||||
"type": "datasource"
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "prod",
|
||||
"value": "prod"
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": "cluster",
|
||||
"multi": false,
|
||||
"name": "cluster",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info, cluster)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 2,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "prod",
|
||||
"value": "prod"
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": "namespace",
|
||||
"multi": false,
|
||||
"name": "namespace",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 2,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "prod",
|
||||
"value": "prod"
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": "pod",
|
||||
"multi": false,
|
||||
"name": "pod",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info{cluster=\"$cluster\", namespace=\"$namespace\"}, pod)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 2,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"auto": false,
|
||||
@ -1545,7 +1621,7 @@ data:
|
||||
"text": "5m",
|
||||
"value": "5m"
|
||||
},
|
||||
"datasource": "prometheus",
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
@ -1570,6 +1646,87 @@ data:
|
||||
"tagsQuery": "",
|
||||
"type": "interval",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "",
|
||||
"value": ""
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "cluster",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info, cluster)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "",
|
||||
"value": ""
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "namespace",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "",
|
||||
"value": ""
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "pod",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info{cluster=\"$cluster\", namespace=\"$namespace\"}, pod)",
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
}
|
||||
]
|
||||
},
|
||||
@ -3441,8 +3598,8 @@ data:
|
||||
"list": [
|
||||
{
|
||||
"current": {
|
||||
"text": "Prometheus",
|
||||
"value": "Prometheus"
|
||||
"text": "default",
|
||||
"value": "default"
|
||||
},
|
||||
"hide": 0,
|
||||
"label": null,
|
||||
@ -3455,114 +3612,6 @@ data:
|
||||
"regex": "",
|
||||
"type": "datasource"
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "prod",
|
||||
"value": "prod"
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": "cluster",
|
||||
"multi": false,
|
||||
"name": "cluster",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info, cluster)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 2,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "prod",
|
||||
"value": "prod"
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": "namespace",
|
||||
"multi": false,
|
||||
"name": "namespace",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 2,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "prod",
|
||||
"value": "prod"
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": "workload",
|
||||
"multi": false,
|
||||
"name": "workload",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}, workload)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 2,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "prod",
|
||||
"value": "prod"
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": "type",
|
||||
"multi": false,
|
||||
"name": "type",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\"}, workload_type)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 2,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"auto": false,
|
||||
@ -3572,7 +3621,7 @@ data:
|
||||
"text": "5m",
|
||||
"value": "5m"
|
||||
},
|
||||
"datasource": "prometheus",
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
@ -3597,6 +3646,114 @@ data:
|
||||
"tagsQuery": "",
|
||||
"type": "interval",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "",
|
||||
"value": ""
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "cluster",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info, cluster)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "",
|
||||
"value": ""
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "namespace",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "",
|
||||
"value": ""
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "workload",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\"}, workload)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "",
|
||||
"value": ""
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "type",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(mixin_pod_workload{cluster=\"$cluster\", namespace=\"$namespace\", workload=\"$workload\"}, workload_type)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
}
|
||||
]
|
||||
},
|
||||
@ -3684,7 +3841,26 @@ data:
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [
|
||||
|
||||
{
|
||||
"alias": "quota - requests",
|
||||
"color": "#F2495C",
|
||||
"dashes": true,
|
||||
"fill": 0,
|
||||
"hideTooltip": true,
|
||||
"legend": false,
|
||||
"linewidth": 2,
|
||||
"stack": false
|
||||
},
|
||||
{
|
||||
"alias": "quota - limits",
|
||||
"color": "#FF9830",
|
||||
"dashes": true,
|
||||
"fill": 0,
|
||||
"hideTooltip": true,
|
||||
"legend": false,
|
||||
"linewidth": 2,
|
||||
"stack": false
|
||||
}
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 12,
|
||||
@ -3698,6 +3874,22 @@ data:
|
||||
"legendFormat": "{{workload}} - {{workload_type}}",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "scalar(kube_resourcequota{cluster=\"$cluster\", namespace=\"$namespace\", type=\"hard\",resource=\"requests.cpu\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "quota - requests",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "scalar(kube_resourcequota{cluster=\"$cluster\", namespace=\"$namespace\", type=\"hard\",resource=\"limits.cpu\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "quota - limits",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
@ -4094,7 +4286,26 @@ data:
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [
|
||||
|
||||
{
|
||||
"alias": "quota - requests",
|
||||
"color": "#F2495C",
|
||||
"dashes": true,
|
||||
"fill": 0,
|
||||
"hideTooltip": true,
|
||||
"legend": false,
|
||||
"linewidth": 2,
|
||||
"stack": false
|
||||
},
|
||||
{
|
||||
"alias": "quota - limits",
|
||||
"color": "#FF9830",
|
||||
"dashes": true,
|
||||
"fill": 0,
|
||||
"hideTooltip": true,
|
||||
"legend": false,
|
||||
"linewidth": 2,
|
||||
"stack": false
|
||||
}
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 12,
|
||||
@ -4108,6 +4319,22 @@ data:
|
||||
"legendFormat": "{{workload}} - {{workload_type}}",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "scalar(kube_resourcequota{cluster=\"$cluster\", namespace=\"$namespace\", type=\"hard\",resource=\"requests.memory\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "quota - requests",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
},
|
||||
{
|
||||
"expr": "scalar(kube_resourcequota{cluster=\"$cluster\", namespace=\"$namespace\", type=\"hard\",resource=\"limits.memory\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "quota - limits",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
@ -5576,8 +5803,8 @@ data:
|
||||
"list": [
|
||||
{
|
||||
"current": {
|
||||
"text": "Prometheus",
|
||||
"value": "Prometheus"
|
||||
"text": "default",
|
||||
"value": "default"
|
||||
},
|
||||
"hide": 0,
|
||||
"label": null,
|
||||
@ -5590,60 +5817,6 @@ data:
|
||||
"regex": "",
|
||||
"type": "datasource"
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "prod",
|
||||
"value": "prod"
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": "cluster",
|
||||
"multi": false,
|
||||
"name": "cluster",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info, cluster)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 2,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "prod",
|
||||
"value": "prod"
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": "namespace",
|
||||
"multi": false,
|
||||
"name": "namespace",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 2,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"auto": false,
|
||||
@ -5653,7 +5826,7 @@ data:
|
||||
"text": "5m",
|
||||
"value": "5m"
|
||||
},
|
||||
"datasource": "prometheus",
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
@ -5706,6 +5879,60 @@ data:
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "",
|
||||
"value": ""
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "cluster",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info, cluster)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "",
|
||||
"value": ""
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "namespace",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
|
@ -21,7 +21,7 @@ data:
|
||||
"links": [
|
||||
|
||||
],
|
||||
"refresh": "",
|
||||
"refresh": "10s",
|
||||
"rows": [
|
||||
{
|
||||
"collapse": false,
|
||||
@ -88,7 +88,7 @@ data:
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(up{job=\"apiserver\"})",
|
||||
"expr": "sum(up{job=\"apiserver\", cluster=\"$cluster\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
@ -155,28 +155,28 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(rate(apiserver_request_total{job=\"apiserver\", instance=~\"$instance\",code=~\"2..\"}[5m]))",
|
||||
"expr": "sum(rate(apiserver_request_total{job=\"apiserver\", instance=~\"$instance\",code=~\"2..\", cluster=\"$cluster\"}[5m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "2xx",
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"expr": "sum(rate(apiserver_request_total{job=\"apiserver\", instance=~\"$instance\",code=~\"3..\"}[5m]))",
|
||||
"expr": "sum(rate(apiserver_request_total{job=\"apiserver\", instance=~\"$instance\",code=~\"3..\", cluster=\"$cluster\"}[5m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "3xx",
|
||||
"refId": "B"
|
||||
},
|
||||
{
|
||||
"expr": "sum(rate(apiserver_request_total{job=\"apiserver\", instance=~\"$instance\",code=~\"4..\"}[5m]))",
|
||||
"expr": "sum(rate(apiserver_request_total{job=\"apiserver\", instance=~\"$instance\",code=~\"4..\", cluster=\"$cluster\"}[5m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "4xx",
|
||||
"refId": "C"
|
||||
},
|
||||
{
|
||||
"expr": "sum(rate(apiserver_request_total{job=\"apiserver\", instance=~\"$instance\",code=~\"5..\"}[5m]))",
|
||||
"expr": "sum(rate(apiserver_request_total{job=\"apiserver\", instance=~\"$instance\",code=~\"5..\", cluster=\"$cluster\"}[5m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "5xx",
|
||||
@ -237,15 +237,15 @@ data:
|
||||
},
|
||||
"id": 4,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -267,7 +267,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\", instance=~\"$instance\"}[5m])) by (verb, le))",
|
||||
"expr": "histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\", instance=~\"$instance\", verb!=\"WATCH\", cluster=\"$cluster\"}[5m])) by (verb, le))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{verb}}",
|
||||
@ -371,7 +371,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(rate(workqueue_adds_total{job=\"apiserver\", instance=~\"$instance\"}[5m])) by (instance, name)",
|
||||
"expr": "sum(rate(workqueue_adds_total{job=\"apiserver\", instance=~\"$instance\", cluster=\"$cluster\"}[5m])) by (instance, name)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}} {{name}}",
|
||||
@ -462,7 +462,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(rate(workqueue_depth{job=\"apiserver\", instance=~\"$instance\"}[5m])) by (instance, name)",
|
||||
"expr": "sum(rate(workqueue_depth{job=\"apiserver\", instance=~\"$instance\", cluster=\"$cluster\"}[5m])) by (instance, name)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}} {{name}}",
|
||||
@ -523,15 +523,15 @@ data:
|
||||
},
|
||||
"id": 7,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -553,7 +553,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "histogram_quantile(0.99, sum(rate(workqueue_queue_duration_seconds_bucket{job=\"apiserver\", instance=~\"$instance\"}[5m])) by (instance, name, le))",
|
||||
"expr": "histogram_quantile(0.99, sum(rate(workqueue_queue_duration_seconds_bucket{job=\"apiserver\", instance=~\"$instance\", cluster=\"$cluster\"}[5m])) by (instance, name, le))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}} {{name}}",
|
||||
@ -657,7 +657,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "etcd_helper_cache_entry_total{job=\"apiserver\", instance=~\"$instance\"}",
|
||||
"expr": "etcd_helper_cache_entry_total{job=\"apiserver\", instance=~\"$instance\", cluster=\"$cluster\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}}",
|
||||
@ -748,14 +748,14 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(rate(etcd_helper_cache_hit_total{job=\"apiserver\",instance=~\"$instance\"}[5m])) by (intance)",
|
||||
"expr": "sum(rate(etcd_helper_cache_hit_total{job=\"apiserver\",instance=~\"$instance\", cluster=\"$cluster\"}[5m])) by (instance)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}} hit",
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"expr": "sum(rate(etcd_helper_cache_miss_total{job=\"apiserver\",instance=~\"$instance\"}[5m])) by (instance)",
|
||||
"expr": "sum(rate(etcd_helper_cache_miss_total{job=\"apiserver\",instance=~\"$instance\", cluster=\"$cluster\"}[5m])) by (instance)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}} miss",
|
||||
@ -846,14 +846,14 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "histogram_quantile(0.99,sum(rate(etcd_request_cache_get_duration_seconds_bucket{job=\"apiserver\",instance=~\"$instance\"}[5m])) by (instance, le))",
|
||||
"expr": "histogram_quantile(0.99,sum(rate(etcd_request_cache_get_duration_seconds_bucket{job=\"apiserver\",instance=~\"$instance\", cluster=\"$cluster\"}[5m])) by (instance, le))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}} get",
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"expr": "histogram_quantile(0.99,sum(rate(etcd_request_cache_add_duration_seconds_bucket{job=\"apiserver\",instance=~\"$instance\"}[5m])) by (instance, le))",
|
||||
"expr": "histogram_quantile(0.99,sum(rate(etcd_request_cache_add_duration_seconds_bucket{job=\"apiserver\",instance=~\"$instance\", cluster=\"$cluster\"}[5m])) by (instance, le))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}} miss",
|
||||
@ -957,7 +957,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "process_resident_memory_bytes{job=\"apiserver\",instance=~\"$instance\"}",
|
||||
"expr": "process_resident_memory_bytes{job=\"apiserver\",instance=~\"$instance\", cluster=\"$cluster\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}}",
|
||||
@ -1048,7 +1048,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "rate(process_cpu_seconds_total{job=\"apiserver\",instance=~\"$instance\"}[5m])",
|
||||
"expr": "rate(process_cpu_seconds_total{job=\"apiserver\",instance=~\"$instance\", cluster=\"$cluster\"}[5m])",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}}",
|
||||
@ -1139,7 +1139,7 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "go_goroutines{job=\"apiserver\",instance=~\"$instance\"}",
|
||||
"expr": "go_goroutines{job=\"apiserver\",instance=~\"$instance\", cluster=\"$cluster\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{instance}}",
|
||||
@ -1205,8 +1205,8 @@ data:
|
||||
"list": [
|
||||
{
|
||||
"current": {
|
||||
"text": "Prometheus",
|
||||
"value": "Prometheus"
|
||||
"text": "default",
|
||||
"value": "default"
|
||||
},
|
||||
"hide": 0,
|
||||
"label": null,
|
||||
@ -1219,6 +1219,33 @@ data:
|
||||
"regex": "",
|
||||
"type": "datasource"
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"text": "prod",
|
||||
"value": "prod"
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "cluster",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(apiserver_request_total, cluster)",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
@ -1233,7 +1260,7 @@ data:
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(apiserver_request_total{job=\"apiserver\"}, instance)",
|
||||
"query": "label_values(apiserver_request_total{job=\"apiserver\", cluster=\"$cluster\"}, instance)",
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
@ -1302,7 +1329,7 @@ data:
|
||||
"links": [
|
||||
|
||||
],
|
||||
"refresh": "",
|
||||
"refresh": "10s",
|
||||
"rows": [
|
||||
{
|
||||
"collapse": false,
|
||||
@ -1406,15 +1433,15 @@ data:
|
||||
},
|
||||
"id": 3,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -1510,15 +1537,15 @@ data:
|
||||
},
|
||||
"id": 4,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -1614,15 +1641,15 @@ data:
|
||||
},
|
||||
"id": 5,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -1934,15 +1961,15 @@ data:
|
||||
},
|
||||
"id": 8,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -2316,8 +2343,8 @@ data:
|
||||
"list": [
|
||||
{
|
||||
"current": {
|
||||
"text": "Prometheus",
|
||||
"value": "Prometheus"
|
||||
"text": "default",
|
||||
"value": "default"
|
||||
},
|
||||
"hide": 0,
|
||||
"label": null,
|
||||
@ -2413,7 +2440,7 @@ data:
|
||||
"links": [
|
||||
|
||||
],
|
||||
"refresh": "",
|
||||
"refresh": "10s",
|
||||
"rows": [
|
||||
{
|
||||
"collapse": false,
|
||||
@ -2815,8 +2842,8 @@ data:
|
||||
"list": [
|
||||
{
|
||||
"current": {
|
||||
"text": "Prometheus",
|
||||
"value": "Prometheus"
|
||||
"text": "default",
|
||||
"value": "default"
|
||||
},
|
||||
"hide": 0,
|
||||
"label": null,
|
||||
@ -2943,664 +2970,6 @@ data:
|
||||
"uid": "919b92a8e8041bd567af9edab12c840c",
|
||||
"version": 0
|
||||
}
|
||||
pods.json: |-
|
||||
{
|
||||
"__inputs": [
|
||||
|
||||
],
|
||||
"__requires": [
|
||||
|
||||
],
|
||||
"annotations": {
|
||||
"list": [
|
||||
{
|
||||
"builtIn": 1,
|
||||
"datasource": "$datasource",
|
||||
"enable": true,
|
||||
"expr": "time() == BOOL timestamp(rate(kube_pod_container_status_restarts_total{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}[2m]) > 0)",
|
||||
"hide": false,
|
||||
"iconColor": "rgba(215, 44, 44, 1)",
|
||||
"name": "Restarts",
|
||||
"showIn": 0,
|
||||
"tags": [
|
||||
"restart"
|
||||
],
|
||||
"type": "rows"
|
||||
}
|
||||
]
|
||||
},
|
||||
"editable": false,
|
||||
"gnetId": null,
|
||||
"graphTooltip": 0,
|
||||
"hideControls": false,
|
||||
"id": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"refresh": "",
|
||||
"rows": [
|
||||
{
|
||||
"collapse": false,
|
||||
"collapsed": false,
|
||||
"panels": [
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 2,
|
||||
"legend": {
|
||||
"alignAsTable": true,
|
||||
"avg": true,
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 12,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum by(container) (container_memory_usage_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\", container!=\"POD\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "Current: {{ container }}",
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"expr": "sum by(container) (kube_pod_container_resource_requests{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"memory\", pod=\"$pod\", container=~\"$container\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "Requested: {{ container }}",
|
||||
"refId": "B"
|
||||
},
|
||||
{
|
||||
"expr": "sum by(container) (kube_pod_container_resource_limits{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"memory\", pod=\"$pod\", container=~\"$container\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "Limit: {{ container }}",
|
||||
"refId": "C"
|
||||
},
|
||||
{
|
||||
"expr": "sum by(container) (container_memory_cache{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$pod\", container=~\"$container\", container!=\"POD\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "Cache: {{ container }}",
|
||||
"refId": "D"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Memory Usage",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "bytes",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "bytes",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"repeat": null,
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": false,
|
||||
"title": "Dashboard Row",
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
},
|
||||
{
|
||||
"collapse": false,
|
||||
"collapsed": false,
|
||||
"panels": [
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 3,
|
||||
"legend": {
|
||||
"alignAsTable": true,
|
||||
"avg": true,
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 12,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum by (container) (irate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", image!=\"\", pod=\"$pod\", container=~\"$container\", container!=\"POD\"}[4m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "Current: {{ container }}",
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"expr": "sum by(container) (kube_pod_container_resource_requests{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"cpu\", pod=\"$pod\", container=~\"$container\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "Requested: {{ container }}",
|
||||
"refId": "B"
|
||||
},
|
||||
{
|
||||
"expr": "sum by(container) (kube_pod_container_resource_limits{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", resource=\"cpu\", pod=\"$pod\", container=~\"$container\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "Limit: {{ container }}",
|
||||
"refId": "C"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "CPU Usage",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"repeat": null,
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": false,
|
||||
"title": "Dashboard Row",
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
},
|
||||
{
|
||||
"collapse": false,
|
||||
"collapsed": false,
|
||||
"panels": [
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 4,
|
||||
"legend": {
|
||||
"alignAsTable": true,
|
||||
"avg": true,
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 12,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sort_desc(sum by (pod) (irate(container_network_receive_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}[4m])))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "RX: {{ pod }}",
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"expr": "sort_desc(sum by (pod) (irate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}[4m])))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "TX: {{ pod }}",
|
||||
"refId": "B"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Network I/O",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "bytes",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "bytes",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"repeat": null,
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": false,
|
||||
"title": "Dashboard Row",
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
},
|
||||
{
|
||||
"collapse": false,
|
||||
"collapsed": false,
|
||||
"panels": [
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 5,
|
||||
"legend": {
|
||||
"alignAsTable": true,
|
||||
"avg": true,
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 12,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "max by (container) (kube_pod_container_status_restarts_total{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "Restarts: {{ container }}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Total Restarts Per Container",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"repeat": null,
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": false,
|
||||
"title": "Dashboard Row",
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
}
|
||||
],
|
||||
"schemaVersion": 14,
|
||||
"style": "dark",
|
||||
"tags": [
|
||||
"kubernetes-mixin"
|
||||
],
|
||||
"templating": {
|
||||
"list": [
|
||||
{
|
||||
"current": {
|
||||
"text": "Prometheus",
|
||||
"value": "Prometheus"
|
||||
},
|
||||
"hide": 0,
|
||||
"label": null,
|
||||
"name": "datasource",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "prometheus",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"type": "datasource"
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": "cluster",
|
||||
"multi": false,
|
||||
"name": "cluster",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info, cluster)",
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": "Namespace",
|
||||
"multi": false,
|
||||
"name": "namespace",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)",
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": "Pod",
|
||||
"multi": false,
|
||||
"name": "pod",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_info{cluster=\"$cluster\", namespace=~\"$namespace\"}, pod)",
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": true,
|
||||
"label": "Container",
|
||||
"multi": false,
|
||||
"name": "container",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_container_info{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"}, container)",
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
}
|
||||
]
|
||||
},
|
||||
"time": {
|
||||
"from": "now-1h",
|
||||
"to": "now"
|
||||
},
|
||||
"timepicker": {
|
||||
"refresh_intervals": [
|
||||
"5s",
|
||||
"10s",
|
||||
"30s",
|
||||
"1m",
|
||||
"5m",
|
||||
"15m",
|
||||
"30m",
|
||||
"1h",
|
||||
"2h",
|
||||
"1d"
|
||||
],
|
||||
"time_options": [
|
||||
"5m",
|
||||
"15m",
|
||||
"1h",
|
||||
"6h",
|
||||
"12h",
|
||||
"24h",
|
||||
"2d",
|
||||
"7d",
|
||||
"30d"
|
||||
]
|
||||
},
|
||||
"timezone": "",
|
||||
"title": "Kubernetes / Pods",
|
||||
"uid": "ab4f13a9892a76a4d21ce8c2445bf4ea",
|
||||
"version": 0
|
||||
}
|
||||
scheduler.json: |-
|
||||
{
|
||||
"__inputs": [
|
||||
@ -3622,7 +2991,7 @@ data:
|
||||
"links": [
|
||||
|
||||
],
|
||||
"refresh": "",
|
||||
"refresh": "10s",
|
||||
"rows": [
|
||||
{
|
||||
"collapse": false,
|
||||
@ -3726,15 +3095,15 @@ data:
|
||||
},
|
||||
"id": 3,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -3838,15 +3207,15 @@ data:
|
||||
},
|
||||
"id": 4,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -4179,15 +3548,15 @@ data:
|
||||
},
|
||||
"id": 7,
|
||||
"legend": {
|
||||
"alignAsTable": "true",
|
||||
"alignAsTable": true,
|
||||
"avg": false,
|
||||
"current": "true",
|
||||
"current": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": "true",
|
||||
"show": "true",
|
||||
"rightSide": true,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": "true"
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
@ -4561,8 +3930,8 @@ data:
|
||||
"list": [
|
||||
{
|
||||
"current": {
|
||||
"text": "Prometheus",
|
||||
"value": "Prometheus"
|
||||
"text": "default",
|
||||
"value": "default"
|
||||
},
|
||||
"hide": 0,
|
||||
"label": null,
|
||||
@ -4637,910 +4006,6 @@ data:
|
||||
"uid": "2e6b6a3b4bddf1427b3a55aa1311c656",
|
||||
"version": 0
|
||||
}
|
||||
statefulset.json: |-
|
||||
{
|
||||
"__inputs": [
|
||||
|
||||
],
|
||||
"__requires": [
|
||||
|
||||
],
|
||||
"annotations": {
|
||||
"list": [
|
||||
|
||||
]
|
||||
},
|
||||
"editable": false,
|
||||
"gnetId": null,
|
||||
"graphTooltip": 0,
|
||||
"hideControls": false,
|
||||
"id": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"refresh": "",
|
||||
"rows": [
|
||||
{
|
||||
"collapse": false,
|
||||
"collapsed": false,
|
||||
"panels": [
|
||||
{
|
||||
"cacheTimeout": null,
|
||||
"colorBackground": false,
|
||||
"colorValue": false,
|
||||
"colors": [
|
||||
"#299c46",
|
||||
"rgba(237, 129, 40, 0.89)",
|
||||
"#d44a3a"
|
||||
],
|
||||
"datasource": "$datasource",
|
||||
"format": "none",
|
||||
"gauge": {
|
||||
"maxValue": 100,
|
||||
"minValue": 0,
|
||||
"show": false,
|
||||
"thresholdLabels": false,
|
||||
"thresholdMarkers": true
|
||||
},
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 2,
|
||||
"interval": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"mappingType": 1,
|
||||
"mappingTypes": [
|
||||
{
|
||||
"name": "value to text",
|
||||
"value": 1
|
||||
},
|
||||
{
|
||||
"name": "range to text",
|
||||
"value": 2
|
||||
}
|
||||
],
|
||||
"maxDataPoints": 100,
|
||||
"nullPointMode": "connected",
|
||||
"nullText": null,
|
||||
"postfix": "cores",
|
||||
"postfixFontSize": "50%",
|
||||
"prefix": "",
|
||||
"prefixFontSize": "50%",
|
||||
"rangeMaps": [
|
||||
{
|
||||
"from": "null",
|
||||
"text": "N/A",
|
||||
"to": "null"
|
||||
}
|
||||
],
|
||||
"span": 4,
|
||||
"sparkline": {
|
||||
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||
"lineColor": "rgb(31, 120, 193)",
|
||||
"show": true
|
||||
},
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}[3m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "",
|
||||
"title": "CPU",
|
||||
"tooltip": {
|
||||
"shared": false
|
||||
},
|
||||
"type": "singlestat",
|
||||
"valueFontSize": "80%",
|
||||
"valueMaps": [
|
||||
{
|
||||
"op": "=",
|
||||
"text": "0",
|
||||
"value": "null"
|
||||
}
|
||||
],
|
||||
"valueName": "current"
|
||||
},
|
||||
{
|
||||
"cacheTimeout": null,
|
||||
"colorBackground": false,
|
||||
"colorValue": false,
|
||||
"colors": [
|
||||
"#299c46",
|
||||
"rgba(237, 129, 40, 0.89)",
|
||||
"#d44a3a"
|
||||
],
|
||||
"datasource": "$datasource",
|
||||
"format": "none",
|
||||
"gauge": {
|
||||
"maxValue": 100,
|
||||
"minValue": 0,
|
||||
"show": false,
|
||||
"thresholdLabels": false,
|
||||
"thresholdMarkers": true
|
||||
},
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 3,
|
||||
"interval": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"mappingType": 1,
|
||||
"mappingTypes": [
|
||||
{
|
||||
"name": "value to text",
|
||||
"value": 1
|
||||
},
|
||||
{
|
||||
"name": "range to text",
|
||||
"value": 2
|
||||
}
|
||||
],
|
||||
"maxDataPoints": 100,
|
||||
"nullPointMode": "connected",
|
||||
"nullText": null,
|
||||
"postfix": "GB",
|
||||
"postfixFontSize": "50%",
|
||||
"prefix": "",
|
||||
"prefixFontSize": "50%",
|
||||
"rangeMaps": [
|
||||
{
|
||||
"from": "null",
|
||||
"text": "N/A",
|
||||
"to": "null"
|
||||
}
|
||||
],
|
||||
"span": 4,
|
||||
"sparkline": {
|
||||
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||
"lineColor": "rgb(31, 120, 193)",
|
||||
"show": true
|
||||
},
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(container_memory_usage_bytes{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}) / 1024^3",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "",
|
||||
"title": "Memory",
|
||||
"tooltip": {
|
||||
"shared": false
|
||||
},
|
||||
"type": "singlestat",
|
||||
"valueFontSize": "80%",
|
||||
"valueMaps": [
|
||||
{
|
||||
"op": "=",
|
||||
"text": "0",
|
||||
"value": "null"
|
||||
}
|
||||
],
|
||||
"valueName": "current"
|
||||
},
|
||||
{
|
||||
"cacheTimeout": null,
|
||||
"colorBackground": false,
|
||||
"colorValue": false,
|
||||
"colors": [
|
||||
"#299c46",
|
||||
"rgba(237, 129, 40, 0.89)",
|
||||
"#d44a3a"
|
||||
],
|
||||
"datasource": "$datasource",
|
||||
"format": "none",
|
||||
"gauge": {
|
||||
"maxValue": 100,
|
||||
"minValue": 0,
|
||||
"show": false,
|
||||
"thresholdLabels": false,
|
||||
"thresholdMarkers": true
|
||||
},
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 4,
|
||||
"interval": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"mappingType": 1,
|
||||
"mappingTypes": [
|
||||
{
|
||||
"name": "value to text",
|
||||
"value": 1
|
||||
},
|
||||
{
|
||||
"name": "range to text",
|
||||
"value": 2
|
||||
}
|
||||
],
|
||||
"maxDataPoints": 100,
|
||||
"nullPointMode": "connected",
|
||||
"nullText": null,
|
||||
"postfix": "Bps",
|
||||
"postfixFontSize": "50%",
|
||||
"prefix": "",
|
||||
"prefixFontSize": "50%",
|
||||
"rangeMaps": [
|
||||
{
|
||||
"from": "null",
|
||||
"text": "N/A",
|
||||
"to": "null"
|
||||
}
|
||||
],
|
||||
"span": 4,
|
||||
"sparkline": {
|
||||
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||
"lineColor": "rgb(31, 120, 193)",
|
||||
"show": true
|
||||
},
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(rate(container_network_transmit_bytes_total{job=\"kubernetes-cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=~\"$statefulset.*\"}[3m])) + sum(rate(container_network_receive_bytes_total{cluster=\"$cluster\", namespace=\"$namespace\",pod=~\"$statefulset.*\"}[3m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "",
|
||||
"title": "Network",
|
||||
"tooltip": {
|
||||
"shared": false
|
||||
},
|
||||
"type": "singlestat",
|
||||
"valueFontSize": "80%",
|
||||
"valueMaps": [
|
||||
{
|
||||
"op": "=",
|
||||
"text": "0",
|
||||
"value": "null"
|
||||
}
|
||||
],
|
||||
"valueName": "current"
|
||||
}
|
||||
],
|
||||
"repeat": null,
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": false,
|
||||
"title": "Dashboard Row",
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
},
|
||||
{
|
||||
"collapse": false,
|
||||
"collapsed": false,
|
||||
"height": "100px",
|
||||
"panels": [
|
||||
{
|
||||
"cacheTimeout": null,
|
||||
"colorBackground": false,
|
||||
"colorValue": false,
|
||||
"colors": [
|
||||
"#299c46",
|
||||
"rgba(237, 129, 40, 0.89)",
|
||||
"#d44a3a"
|
||||
],
|
||||
"datasource": "$datasource",
|
||||
"format": "none",
|
||||
"gauge": {
|
||||
"maxValue": 100,
|
||||
"minValue": 0,
|
||||
"show": false,
|
||||
"thresholdLabels": false,
|
||||
"thresholdMarkers": true
|
||||
},
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 5,
|
||||
"interval": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"mappingType": 1,
|
||||
"mappingTypes": [
|
||||
{
|
||||
"name": "value to text",
|
||||
"value": 1
|
||||
},
|
||||
{
|
||||
"name": "range to text",
|
||||
"value": 2
|
||||
}
|
||||
],
|
||||
"maxDataPoints": 100,
|
||||
"nullPointMode": "connected",
|
||||
"nullText": null,
|
||||
"postfix": "",
|
||||
"postfixFontSize": "50%",
|
||||
"prefix": "",
|
||||
"prefixFontSize": "50%",
|
||||
"rangeMaps": [
|
||||
{
|
||||
"from": "null",
|
||||
"text": "N/A",
|
||||
"to": "null"
|
||||
}
|
||||
],
|
||||
"span": 3,
|
||||
"sparkline": {
|
||||
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||
"full": false,
|
||||
"lineColor": "rgb(31, 120, 193)",
|
||||
"show": false
|
||||
},
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "max(kube_statefulset_replicas{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", statefulset=\"$statefulset\"}) without (instance, pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "",
|
||||
"title": "Desired Replicas",
|
||||
"tooltip": {
|
||||
"shared": false
|
||||
},
|
||||
"type": "singlestat",
|
||||
"valueFontSize": "80%",
|
||||
"valueMaps": [
|
||||
{
|
||||
"op": "=",
|
||||
"text": "0",
|
||||
"value": "null"
|
||||
}
|
||||
],
|
||||
"valueName": "current"
|
||||
},
|
||||
{
|
||||
"cacheTimeout": null,
|
||||
"colorBackground": false,
|
||||
"colorValue": false,
|
||||
"colors": [
|
||||
"#299c46",
|
||||
"rgba(237, 129, 40, 0.89)",
|
||||
"#d44a3a"
|
||||
],
|
||||
"datasource": "$datasource",
|
||||
"format": "none",
|
||||
"gauge": {
|
||||
"maxValue": 100,
|
||||
"minValue": 0,
|
||||
"show": false,
|
||||
"thresholdLabels": false,
|
||||
"thresholdMarkers": true
|
||||
},
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 6,
|
||||
"interval": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"mappingType": 1,
|
||||
"mappingTypes": [
|
||||
{
|
||||
"name": "value to text",
|
||||
"value": 1
|
||||
},
|
||||
{
|
||||
"name": "range to text",
|
||||
"value": 2
|
||||
}
|
||||
],
|
||||
"maxDataPoints": 100,
|
||||
"nullPointMode": "connected",
|
||||
"nullText": null,
|
||||
"postfix": "",
|
||||
"postfixFontSize": "50%",
|
||||
"prefix": "",
|
||||
"prefixFontSize": "50%",
|
||||
"rangeMaps": [
|
||||
{
|
||||
"from": "null",
|
||||
"text": "N/A",
|
||||
"to": "null"
|
||||
}
|
||||
],
|
||||
"span": 3,
|
||||
"sparkline": {
|
||||
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||
"full": false,
|
||||
"lineColor": "rgb(31, 120, 193)",
|
||||
"show": false
|
||||
},
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "min(kube_statefulset_status_replicas_current{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", statefulset=\"$statefulset\"}) without (instance, pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "",
|
||||
"title": "Replicas of current version",
|
||||
"tooltip": {
|
||||
"shared": false
|
||||
},
|
||||
"type": "singlestat",
|
||||
"valueFontSize": "80%",
|
||||
"valueMaps": [
|
||||
{
|
||||
"op": "=",
|
||||
"text": "0",
|
||||
"value": "null"
|
||||
}
|
||||
],
|
||||
"valueName": "current"
|
||||
},
|
||||
{
|
||||
"cacheTimeout": null,
|
||||
"colorBackground": false,
|
||||
"colorValue": false,
|
||||
"colors": [
|
||||
"#299c46",
|
||||
"rgba(237, 129, 40, 0.89)",
|
||||
"#d44a3a"
|
||||
],
|
||||
"datasource": "$datasource",
|
||||
"format": "none",
|
||||
"gauge": {
|
||||
"maxValue": 100,
|
||||
"minValue": 0,
|
||||
"show": false,
|
||||
"thresholdLabels": false,
|
||||
"thresholdMarkers": true
|
||||
},
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 7,
|
||||
"interval": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"mappingType": 1,
|
||||
"mappingTypes": [
|
||||
{
|
||||
"name": "value to text",
|
||||
"value": 1
|
||||
},
|
||||
{
|
||||
"name": "range to text",
|
||||
"value": 2
|
||||
}
|
||||
],
|
||||
"maxDataPoints": 100,
|
||||
"nullPointMode": "connected",
|
||||
"nullText": null,
|
||||
"postfix": "",
|
||||
"postfixFontSize": "50%",
|
||||
"prefix": "",
|
||||
"prefixFontSize": "50%",
|
||||
"rangeMaps": [
|
||||
{
|
||||
"from": "null",
|
||||
"text": "N/A",
|
||||
"to": "null"
|
||||
}
|
||||
],
|
||||
"span": 3,
|
||||
"sparkline": {
|
||||
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||
"full": false,
|
||||
"lineColor": "rgb(31, 120, 193)",
|
||||
"show": false
|
||||
},
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "max(kube_statefulset_status_observed_generation{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", statefulset=\"$statefulset\"}) without (instance, pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "",
|
||||
"title": "Observed Generation",
|
||||
"tooltip": {
|
||||
"shared": false
|
||||
},
|
||||
"type": "singlestat",
|
||||
"valueFontSize": "80%",
|
||||
"valueMaps": [
|
||||
{
|
||||
"op": "=",
|
||||
"text": "0",
|
||||
"value": "null"
|
||||
}
|
||||
],
|
||||
"valueName": "current"
|
||||
},
|
||||
{
|
||||
"cacheTimeout": null,
|
||||
"colorBackground": false,
|
||||
"colorValue": false,
|
||||
"colors": [
|
||||
"#299c46",
|
||||
"rgba(237, 129, 40, 0.89)",
|
||||
"#d44a3a"
|
||||
],
|
||||
"datasource": "$datasource",
|
||||
"format": "none",
|
||||
"gauge": {
|
||||
"maxValue": 100,
|
||||
"minValue": 0,
|
||||
"show": false,
|
||||
"thresholdLabels": false,
|
||||
"thresholdMarkers": true
|
||||
},
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 8,
|
||||
"interval": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"mappingType": 1,
|
||||
"mappingTypes": [
|
||||
{
|
||||
"name": "value to text",
|
||||
"value": 1
|
||||
},
|
||||
{
|
||||
"name": "range to text",
|
||||
"value": 2
|
||||
}
|
||||
],
|
||||
"maxDataPoints": 100,
|
||||
"nullPointMode": "connected",
|
||||
"nullText": null,
|
||||
"postfix": "",
|
||||
"postfixFontSize": "50%",
|
||||
"prefix": "",
|
||||
"prefixFontSize": "50%",
|
||||
"rangeMaps": [
|
||||
{
|
||||
"from": "null",
|
||||
"text": "N/A",
|
||||
"to": "null"
|
||||
}
|
||||
],
|
||||
"span": 3,
|
||||
"sparkline": {
|
||||
"fillColor": "rgba(31, 118, 189, 0.18)",
|
||||
"full": false,
|
||||
"lineColor": "rgb(31, 120, 193)",
|
||||
"show": false
|
||||
},
|
||||
"tableColumn": "",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "max(kube_statefulset_metadata_generation{job=\"kube-state-metrics\", statefulset=\"$statefulset\", cluster=\"$cluster\", namespace=\"$namespace\"}) without (instance, pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": "",
|
||||
"title": "Metadata Generation",
|
||||
"tooltip": {
|
||||
"shared": false
|
||||
},
|
||||
"type": "singlestat",
|
||||
"valueFontSize": "80%",
|
||||
"valueMaps": [
|
||||
{
|
||||
"op": "=",
|
||||
"text": "0",
|
||||
"value": "null"
|
||||
}
|
||||
],
|
||||
"valueName": "current"
|
||||
}
|
||||
],
|
||||
"repeat": null,
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": false,
|
||||
"title": "Dashboard Row",
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
},
|
||||
{
|
||||
"collapse": false,
|
||||
"collapsed": false,
|
||||
"panels": [
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 9,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "max(kube_statefulset_replicas{job=\"kube-state-metrics\", statefulset=\"$statefulset\", cluster=\"$cluster\", namespace=\"$namespace\"}) without (instance, pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "replicas specified",
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"expr": "max(kube_statefulset_status_replicas{job=\"kube-state-metrics\", statefulset=\"$statefulset\", cluster=\"$cluster\", namespace=\"$namespace\"}) without (instance, pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "replicas created",
|
||||
"refId": "B"
|
||||
},
|
||||
{
|
||||
"expr": "min(kube_statefulset_status_replicas_ready{job=\"kube-state-metrics\", statefulset=\"$statefulset\", cluster=\"$cluster\", namespace=\"$namespace\"}) without (instance, pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "ready",
|
||||
"refId": "C"
|
||||
},
|
||||
{
|
||||
"expr": "min(kube_statefulset_status_replicas_current{job=\"kube-state-metrics\", statefulset=\"$statefulset\", cluster=\"$cluster\", namespace=\"$namespace\"}) without (instance, pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "replicas of current version",
|
||||
"refId": "D"
|
||||
},
|
||||
{
|
||||
"expr": "min(kube_statefulset_status_replicas_updated{job=\"kube-state-metrics\", statefulset=\"$statefulset\", cluster=\"$cluster\", namespace=\"$namespace\"}) without (instance, pod)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "updated",
|
||||
"refId": "E"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Replicas",
|
||||
"tooltip": {
|
||||
"shared": false,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"repeat": null,
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": false,
|
||||
"title": "Dashboard Row",
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
}
|
||||
],
|
||||
"schemaVersion": 14,
|
||||
"style": "dark",
|
||||
"tags": [
|
||||
"kubernetes-mixin"
|
||||
],
|
||||
"templating": {
|
||||
"list": [
|
||||
{
|
||||
"current": {
|
||||
"text": "Prometheus",
|
||||
"value": "Prometheus"
|
||||
},
|
||||
"hide": 0,
|
||||
"label": null,
|
||||
"name": "datasource",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "prometheus",
|
||||
"refresh": 1,
|
||||
"regex": "",
|
||||
"type": "datasource"
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 2,
|
||||
"includeAll": false,
|
||||
"label": "cluster",
|
||||
"multi": false,
|
||||
"name": "cluster",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_statefulset_metadata_generation, cluster)",
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": "Namespace",
|
||||
"multi": false,
|
||||
"name": "namespace",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_statefulset_metadata_generation{job=\"kube-state-metrics\", cluster=\"$cluster\"}, namespace)",
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": "Name",
|
||||
"multi": false,
|
||||
"name": "statefulset",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_statefulset_metadata_generation{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\"}, statefulset)",
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
}
|
||||
]
|
||||
},
|
||||
"time": {
|
||||
"from": "now-1h",
|
||||
"to": "now"
|
||||
},
|
||||
"timepicker": {
|
||||
"refresh_intervals": [
|
||||
"5s",
|
||||
"10s",
|
||||
"30s",
|
||||
"1m",
|
||||
"5m",
|
||||
"15m",
|
||||
"30m",
|
||||
"1h",
|
||||
"2h",
|
||||
"1d"
|
||||
],
|
||||
"time_options": [
|
||||
"5m",
|
||||
"15m",
|
||||
"1h",
|
||||
"6h",
|
||||
"12h",
|
||||
"24h",
|
||||
"2d",
|
||||
"7d",
|
||||
"30d"
|
||||
]
|
||||
},
|
||||
"timezone": "",
|
||||
"title": "Kubernetes / StatefulSets",
|
||||
"uid": "a31c1f46e6f727cb37c0d731a7245005",
|
||||
"version": 0
|
||||
}
|
||||
kind: ConfigMap
|
||||
metadata:
|
||||
name: grafana-dashboards-k8s
|
||||
|
@ -2,6 +2,12 @@ apiVersion: v1
|
||||
data:
|
||||
prometheus-remote-write.json: |-
|
||||
{
|
||||
"__inputs": [
|
||||
|
||||
],
|
||||
"__requires": [
|
||||
|
||||
],
|
||||
"annotations": {
|
||||
"list": [
|
||||
|
||||
@ -11,14 +17,15 @@ data:
|
||||
"gnetId": null,
|
||||
"graphTooltip": 0,
|
||||
"hideControls": false,
|
||||
"id": null,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"refresh": "10s",
|
||||
"refresh": "",
|
||||
"rows": [
|
||||
{
|
||||
"collapse": false,
|
||||
"height": "250px",
|
||||
"collapsed": false,
|
||||
"panels": [
|
||||
{
|
||||
"aliasColors": {
|
||||
@ -29,12 +36,17 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"id": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 2,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
@ -44,11 +56,12 @@ data:
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null as zero",
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
@ -58,12 +71,11 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "prometheus_remote_storage_highest_timestamp_in_seconds{cluster=~\"$cluster\", instance=~\"$instance\"} - ignoring(queue) group_right(instance) prometheus_remote_storage_queue_highest_sent_timestamp_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}",
|
||||
"expr": "(\n prometheus_remote_storage_highest_timestamp_in_seconds{cluster=~\"$cluster\", instance=~\"$instance\"} \n- \n ignoring(queue) group_right(instance) prometheus_remote_storage_queue_highest_sent_timestamp_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}\n)\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
@ -89,11 +101,11 @@ data:
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "s",
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
@ -102,7 +114,7 @@ data:
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": false
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
},
|
||||
@ -115,12 +127,17 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"id": 2,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 3,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
@ -130,11 +147,12 @@ data:
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null as zero",
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
@ -144,12 +162,11 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "rate(prometheus_remote_storage_highest_timestamp_in_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) - ignoring (queue) group_right(instance) rate(prometheus_remote_storage_queue_highest_sent_timestamp_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])",
|
||||
"expr": "(\n rate(prometheus_remote_storage_highest_timestamp_in_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) \n- \n ignoring (queue) group_right(instance) rate(prometheus_remote_storage_queue_highest_sent_timestamp_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n)\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
@ -179,7 +196,7 @@ data:
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
@ -188,7 +205,7 @@ data:
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": false
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
}
|
||||
@ -198,11 +215,12 @@ data:
|
||||
"repeatRowId": null,
|
||||
"showTitle": true,
|
||||
"title": "Timestamps",
|
||||
"titleSize": "h6"
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
},
|
||||
{
|
||||
"collapse": false,
|
||||
"height": "250px",
|
||||
"collapsed": false,
|
||||
"panels": [
|
||||
{
|
||||
"aliasColors": {
|
||||
@ -213,12 +231,17 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"id": 3,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 4,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
@ -228,11 +251,12 @@ data:
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null as zero",
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
@ -242,12 +266,11 @@ data:
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "rate(prometheus_remote_storage_samples_in_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])- ignoring(queue) group_right(instance) rate(prometheus_remote_storage_succeeded_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) - rate(prometheus_remote_storage_dropped_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])",
|
||||
"expr": "rate(\n prometheus_remote_storage_samples_in_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n- \n ignoring(queue) group_right(instance) rate(prometheus_remote_storage_succeeded_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) \n- \n rate(prometheus_remote_storage_dropped_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
@ -277,7 +300,7 @@ data:
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
@ -286,7 +309,7 @@ data:
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": false
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
}
|
||||
@ -296,11 +319,12 @@ data:
|
||||
"repeatRowId": null,
|
||||
"showTitle": true,
|
||||
"title": "Samples",
|
||||
"titleSize": "h6"
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
},
|
||||
{
|
||||
"collapse": false,
|
||||
"height": "250px",
|
||||
"collapsed": false,
|
||||
"panels": [
|
||||
{
|
||||
"aliasColors": {
|
||||
@ -311,12 +335,17 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"id": 4,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 5,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
@ -326,16 +355,18 @@ data:
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null as zero",
|
||||
"minSpan": 6,
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 6,
|
||||
"span": 12,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
@ -344,8 +375,7 @@ data:
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
@ -353,7 +383,7 @@ data:
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Num. Shards",
|
||||
"title": "Current Shards",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
@ -375,7 +405,7 @@ data:
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
@ -384,7 +414,7 @@ data:
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": false
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
},
|
||||
@ -397,12 +427,17 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"id": 5,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 6,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
@ -412,11 +447,298 @@ data:
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null as zero",
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 4,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "prometheus_remote_storage_shards_max{cluster=~\"$cluster\", instance=~\"$instance\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Max Shards",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 7,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 4,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "prometheus_remote_storage_shards_min{cluster=~\"$cluster\", instance=~\"$instance\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Min Shards",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 8,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 4,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "prometheus_remote_storage_shards_desired{cluster=~\"$cluster\", instance=~\"$instance\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Desired Shards",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"repeat": null,
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": true,
|
||||
"title": "Shards",
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
},
|
||||
{
|
||||
"collapse": false,
|
||||
"collapsed": false,
|
||||
"panels": [
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 9,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
@ -430,8 +752,7 @@ data:
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
@ -439,7 +760,7 @@ data:
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Capacity",
|
||||
"title": "Shard Capacity",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
@ -461,7 +782,7 @@ data:
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
@ -470,7 +791,98 @@ data:
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": false
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 10,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 6,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "prometheus_remote_storage_pending_samples{cluster=~\"$cluster\", instance=~\"$instance\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Pending Samples",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
}
|
||||
@ -479,12 +891,13 @@ data:
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": true,
|
||||
"title": "Shards",
|
||||
"titleSize": "h6"
|
||||
"title": "Shard Details",
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
},
|
||||
{
|
||||
"collapse": false,
|
||||
"height": "250px",
|
||||
"collapsed": false,
|
||||
"panels": [
|
||||
{
|
||||
"aliasColors": {
|
||||
@ -495,12 +908,17 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"id": 6,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 11,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
@ -510,11 +928,207 @@ data:
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null as zero",
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 6,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "prometheus_tsdb_wal_segment_current{cluster=~\"$cluster\", instance=~\"$instance\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "TSDB Current Segment",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "none",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 12,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"span": 6,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "prometheus_wal_watcher_current_segment{cluster=~\"$cluster\", instance=~\"$instance\"}",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
|
||||
],
|
||||
"timeFrom": null,
|
||||
"timeShift": null,
|
||||
"title": "Remote Write Current Segment",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": [
|
||||
|
||||
]
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "none",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"repeat": null,
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": true,
|
||||
"title": "Segments",
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
},
|
||||
{
|
||||
"collapse": false,
|
||||
"collapsed": false,
|
||||
"panels": [
|
||||
{
|
||||
"aliasColors": {
|
||||
|
||||
},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 13,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
@ -528,8 +1142,7 @@ data:
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
@ -559,7 +1172,7 @@ data:
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
@ -568,7 +1181,7 @@ data:
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": false
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
},
|
||||
@ -581,12 +1194,17 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"id": 7,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 14,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
@ -596,11 +1214,12 @@ data:
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null as zero",
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
@ -614,8 +1233,7 @@ data:
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
@ -645,7 +1263,7 @@ data:
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
@ -654,7 +1272,7 @@ data:
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": false
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
},
|
||||
@ -667,12 +1285,17 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"id": 8,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 15,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
@ -682,11 +1305,12 @@ data:
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null as zero",
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
@ -700,8 +1324,7 @@ data:
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
@ -731,7 +1354,7 @@ data:
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
@ -740,7 +1363,7 @@ data:
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": false
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
},
|
||||
@ -753,12 +1376,17 @@ data:
|
||||
"dashes": false,
|
||||
"datasource": "$datasource",
|
||||
"fill": 1,
|
||||
"id": 9,
|
||||
"gridPos": {
|
||||
|
||||
},
|
||||
"id": 16,
|
||||
"legend": {
|
||||
"alignAsTable": false,
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"rightSide": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
@ -768,11 +1396,12 @@ data:
|
||||
"links": [
|
||||
|
||||
],
|
||||
"nullPointMode": "null as zero",
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"repeat": null,
|
||||
"seriesOverrides": [
|
||||
|
||||
],
|
||||
@ -786,8 +1415,7 @@ data:
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "{{cluster}}:{{instance}}-{{queue}}",
|
||||
"legendLink": null,
|
||||
"step": 10
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [
|
||||
@ -817,7 +1445,7 @@ data:
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": 0,
|
||||
"min": null,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
@ -826,7 +1454,7 @@ data:
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": null,
|
||||
"show": false
|
||||
"show": true
|
||||
}
|
||||
]
|
||||
}
|
||||
@ -835,8 +1463,9 @@ data:
|
||||
"repeatIteration": null,
|
||||
"repeatRowId": null,
|
||||
"showTitle": true,
|
||||
"title": "Misc Rates.",
|
||||
"titleSize": "h6"
|
||||
"title": "Misc. Rates",
|
||||
"titleSize": "h6",
|
||||
"type": "row"
|
||||
}
|
||||
],
|
||||
"schemaVersion": 14,
|
||||
@ -847,10 +1476,6 @@ data:
|
||||
"templating": {
|
||||
"list": [
|
||||
{
|
||||
"current": {
|
||||
"text": "Prometheus",
|
||||
"value": "Prometheus"
|
||||
},
|
||||
"hide": 0,
|
||||
"label": null,
|
||||
"name": "datasource",
|
||||
@ -865,23 +1490,30 @@ data:
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"selected": true,
|
||||
"text": "All",
|
||||
"value": "$__all"
|
||||
"text": {
|
||||
"selected": true,
|
||||
"text": "All",
|
||||
"value": "$__all"
|
||||
},
|
||||
"value": {
|
||||
"selected": true,
|
||||
"text": "All",
|
||||
"value": "$__all"
|
||||
}
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": true,
|
||||
"label": "instance",
|
||||
"multi": true,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "instance",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(prometheus_build_info, instance)",
|
||||
"refresh": 1,
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 2,
|
||||
"sort": 0,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
@ -893,23 +1525,56 @@ data:
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
"selected": true,
|
||||
"text": "All",
|
||||
"value": "$__all"
|
||||
"text": {
|
||||
"selected": true,
|
||||
"text": "All",
|
||||
"value": "$__all"
|
||||
},
|
||||
"value": {
|
||||
"selected": true,
|
||||
"text": "All",
|
||||
"value": "$__all"
|
||||
}
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": true,
|
||||
"label": "cluster",
|
||||
"multi": true,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "cluster",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(kube_pod_container_info{image=~\".*prometheus.*\"}, cluster)",
|
||||
"refresh": 1,
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 2,
|
||||
"sort": 0,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
],
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
},
|
||||
{
|
||||
"allValue": null,
|
||||
"current": {
|
||||
|
||||
},
|
||||
"datasource": "$datasource",
|
||||
"hide": 0,
|
||||
"includeAll": true,
|
||||
"label": null,
|
||||
"multi": false,
|
||||
"name": "queue",
|
||||
"options": [
|
||||
|
||||
],
|
||||
"query": "label_values(prometheus_remote_storage_shards{cluster=~\"$cluster\", instance=~\"$instance\"}, queue)",
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"sort": 0,
|
||||
"tagValuesQuery": "",
|
||||
"tags": [
|
||||
|
||||
@ -921,7 +1586,7 @@ data:
|
||||
]
|
||||
},
|
||||
"time": {
|
||||
"from": "now-1h",
|
||||
"from": "now-6h",
|
||||
"to": "now"
|
||||
},
|
||||
"timepicker": {
|
||||
@ -949,9 +1614,8 @@ data:
|
||||
"30d"
|
||||
]
|
||||
},
|
||||
"timezone": "utc",
|
||||
"timezone": "browser",
|
||||
"title": "Prometheus Remote Write",
|
||||
"uid": "",
|
||||
"version": 0
|
||||
}
|
||||
prometheus.json: |-
|
||||
@ -2048,8 +2712,8 @@ data:
|
||||
"list": [
|
||||
{
|
||||
"current": {
|
||||
"text": "Prometheus",
|
||||
"value": "Prometheus"
|
||||
"text": "default",
|
||||
"value": "default"
|
||||
},
|
||||
"hide": 0,
|
||||
"label": null,
|
||||
|
@ -23,7 +23,7 @@ spec:
|
||||
spec:
|
||||
containers:
|
||||
- name: grafana
|
||||
image: docker.io/grafana/grafana:6.5.1
|
||||
image: docker.io/grafana/grafana:6.7.1
|
||||
env:
|
||||
- name: GF_PATHS_CONFIG
|
||||
value: "/etc/grafana/custom.ini"
|
||||
|
@ -22,7 +22,7 @@ spec:
|
||||
spec:
|
||||
containers:
|
||||
- name: nginx-ingress-controller
|
||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.26.1
|
||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.30.0
|
||||
args:
|
||||
- /nginx-ingress-controller
|
||||
- --ingress-class=public
|
||||
@ -76,6 +76,6 @@ spec:
|
||||
- NET_BIND_SERVICE
|
||||
drop:
|
||||
- ALL
|
||||
runAsUser: 33 # www-data
|
||||
runAsUser: 101 # www-data
|
||||
restartPolicy: Always
|
||||
terminationGracePeriodSeconds: 300
|
||||
|
@ -22,7 +22,7 @@ spec:
|
||||
spec:
|
||||
containers:
|
||||
- name: nginx-ingress-controller
|
||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.26.1
|
||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.30.0
|
||||
args:
|
||||
- /nginx-ingress-controller
|
||||
- --ingress-class=public
|
||||
@ -76,6 +76,6 @@ spec:
|
||||
- NET_BIND_SERVICE
|
||||
drop:
|
||||
- ALL
|
||||
runAsUser: 33 # www-data
|
||||
runAsUser: 101 # www-data
|
||||
restartPolicy: Always
|
||||
terminationGracePeriodSeconds: 300
|
||||
|
@ -22,7 +22,7 @@ spec:
|
||||
spec:
|
||||
containers:
|
||||
- name: nginx-ingress-controller
|
||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.26.1
|
||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.30.0
|
||||
args:
|
||||
- /nginx-ingress-controller
|
||||
- --ingress-class=public
|
||||
@ -73,7 +73,7 @@ spec:
|
||||
- NET_BIND_SERVICE
|
||||
drop:
|
||||
- ALL
|
||||
runAsUser: 33 # www-data
|
||||
runAsUser: 101 # www-data
|
||||
restartPolicy: Always
|
||||
terminationGracePeriodSeconds: 300
|
||||
|
||||
|
@ -22,7 +22,7 @@ spec:
|
||||
spec:
|
||||
containers:
|
||||
- name: nginx-ingress-controller
|
||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.26.1
|
||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.30.0
|
||||
args:
|
||||
- /nginx-ingress-controller
|
||||
- --ingress-class=public
|
||||
@ -76,6 +76,6 @@ spec:
|
||||
- NET_BIND_SERVICE
|
||||
drop:
|
||||
- ALL
|
||||
runAsUser: 33 # www-data
|
||||
runAsUser: 101 # www-data
|
||||
restartPolicy: Always
|
||||
terminationGracePeriodSeconds: 300
|
||||
|
@ -22,7 +22,7 @@ spec:
|
||||
spec:
|
||||
containers:
|
||||
- name: nginx-ingress-controller
|
||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.26.1
|
||||
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.30.0
|
||||
args:
|
||||
- /nginx-ingress-controller
|
||||
- --ingress-class=public
|
||||
@ -76,6 +76,6 @@ spec:
|
||||
- NET_BIND_SERVICE
|
||||
drop:
|
||||
- ALL
|
||||
runAsUser: 33 # www-data
|
||||
runAsUser: 101 # www-data
|
||||
restartPolicy: Always
|
||||
terminationGracePeriodSeconds: 300
|
||||
|
@ -20,7 +20,7 @@ spec:
|
||||
serviceAccountName: prometheus
|
||||
containers:
|
||||
- name: prometheus
|
||||
image: quay.io/prometheus/prometheus:v2.14.0
|
||||
image: quay.io/prometheus/prometheus:v2.17.1
|
||||
args:
|
||||
- --web.listen-address=0.0.0.0:9090
|
||||
- --config.file=/etc/prometheus/prometheus.yaml
|
||||
|
@ -1,3 +1,4 @@
|
||||
# Allow Prometheus to scrape service endpoints
|
||||
apiVersion: v1
|
||||
kind: Service
|
||||
metadata:
|
||||
@ -7,7 +8,6 @@ metadata:
|
||||
prometheus.io/scrape: 'true'
|
||||
spec:
|
||||
type: ClusterIP
|
||||
# service is created to allow prometheus to scrape endpoints
|
||||
clusterIP: None
|
||||
selector:
|
||||
k8s-app: kube-controller-manager
|
||||
|
19
addons/prometheus/discovery/kube-proxy.yaml
Normal file
19
addons/prometheus/discovery/kube-proxy.yaml
Normal file
@ -0,0 +1,19 @@
|
||||
# Allow Prometheus to scrape service endpoints
|
||||
apiVersion: v1
|
||||
kind: Service
|
||||
metadata:
|
||||
name: kube-proxy
|
||||
namespace: kube-system
|
||||
annotations:
|
||||
prometheus.io/scrape: 'true'
|
||||
prometheus.io/port: '10249'
|
||||
spec:
|
||||
type: ClusterIP
|
||||
clusterIP: None
|
||||
selector:
|
||||
k8s-app: kube-proxy
|
||||
ports:
|
||||
- name: metrics
|
||||
protocol: TCP
|
||||
port: 10249
|
||||
targetPort: 10249
|
@ -1,3 +1,4 @@
|
||||
# Allow Prometheus to scrape service endpoints
|
||||
apiVersion: v1
|
||||
kind: Service
|
||||
metadata:
|
||||
@ -7,7 +8,6 @@ metadata:
|
||||
prometheus.io/scrape: 'true'
|
||||
spec:
|
||||
type: ClusterIP
|
||||
# service is created to allow prometheus to scrape endpoints
|
||||
clusterIP: None
|
||||
selector:
|
||||
k8s-app: kube-scheduler
|
||||
|
@ -74,6 +74,7 @@ rules:
|
||||
- storage.k8s.io
|
||||
resources:
|
||||
- storageclasses
|
||||
- volumeattachments
|
||||
verbs:
|
||||
- list
|
||||
- watch
|
||||
@ -84,4 +85,19 @@ rules:
|
||||
verbs:
|
||||
- list
|
||||
- watch
|
||||
- apiGroups:
|
||||
- admissionregistration.k8s.io
|
||||
resources:
|
||||
- mutatingwebhookconfigurations
|
||||
- validatingwebhookconfigurations
|
||||
verbs:
|
||||
- list
|
||||
- watch
|
||||
- apiGroups:
|
||||
- networking.k8s.io
|
||||
resources:
|
||||
- networkpolicies
|
||||
verbs:
|
||||
- list
|
||||
- watch
|
||||
|
||||
|
@ -24,7 +24,7 @@ spec:
|
||||
serviceAccountName: kube-state-metrics
|
||||
containers:
|
||||
- name: kube-state-metrics
|
||||
image: quay.io/coreos/kube-state-metrics:v1.8.0
|
||||
image: quay.io/coreos/kube-state-metrics:v1.9.5
|
||||
ports:
|
||||
- name: metrics
|
||||
containerPort: 8080
|
||||
|
@ -28,7 +28,7 @@ spec:
|
||||
hostPID: true
|
||||
containers:
|
||||
- name: node-exporter
|
||||
image: quay.io/prometheus/node-exporter:v0.18.1
|
||||
image: quay.io/prometheus/node-exporter:v1.0.0-rc.0
|
||||
args:
|
||||
- --path.procfs=/host/proc
|
||||
- --path.sysfs=/host/sys
|
||||
@ -57,7 +57,9 @@ spec:
|
||||
mountPath: /host/root
|
||||
readOnly: true
|
||||
tolerations:
|
||||
- effect: NoSchedule
|
||||
- key: node-role.kubernetes.io/master
|
||||
operator: Exists
|
||||
- key: node.kubernetes.io/not-ready
|
||||
operator: Exists
|
||||
volumes:
|
||||
- name: proc
|
||||
|
@ -42,10 +42,10 @@ data:
|
||||
{
|
||||
"alert": "etcdHighNumberOfLeaderChanges",
|
||||
"annotations": {
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": instance {{ $labels.instance }} has seen {{ $value }} leader changes within the last 30 minutes."
|
||||
"message": "etcd cluster \"{{ $labels.job }}\": {{ $value }} leader changes within the last 15 minutes. Frequent elections may be a sign of insufficient resources, high network latency, or disruptions by other components and should be investigated."
|
||||
},
|
||||
"expr": "rate(etcd_server_leader_changes_seen_total{job=~\".*etcd.*\"}[15m]) > 3\n",
|
||||
"for": "15m",
|
||||
"expr": "increase((max by (job) (etcd_server_leader_changes_seen_total{job=~\".*etcd.*\"}) or 0*absent(etcd_server_leader_changes_seen_total{job=~\".*etcd.*\"}))[15m:1m]) >= 3\n",
|
||||
"for": "5m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
@ -145,25 +145,132 @@ data:
|
||||
kube.yaml: |-
|
||||
{
|
||||
"groups": [
|
||||
{
|
||||
"name": "kube-apiserver-error",
|
||||
"rules": [
|
||||
{
|
||||
"expr": "sum by (status_class) (\n label_replace(\n rate(apiserver_request_total{job=\"apiserver\"}[5m]\n ), \"status_class\", \"${1}xx\", \"code\", \"([0-9])..\")\n)\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class:apiserver_request_total:rate5m"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (status_class) (\n label_replace(\n rate(apiserver_request_total{job=\"apiserver\"}[30m]\n ), \"status_class\", \"${1}xx\", \"code\", \"([0-9])..\")\n)\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class:apiserver_request_total:rate30m"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (status_class) (\n label_replace(\n rate(apiserver_request_total{job=\"apiserver\"}[1h]\n ), \"status_class\", \"${1}xx\", \"code\", \"([0-9])..\")\n)\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class:apiserver_request_total:rate1h"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (status_class) (\n label_replace(\n rate(apiserver_request_total{job=\"apiserver\"}[2h]\n ), \"status_class\", \"${1}xx\", \"code\", \"([0-9])..\")\n)\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class:apiserver_request_total:rate2h"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (status_class) (\n label_replace(\n rate(apiserver_request_total{job=\"apiserver\"}[6h]\n ), \"status_class\", \"${1}xx\", \"code\", \"([0-9])..\")\n)\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class:apiserver_request_total:rate6h"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (status_class) (\n label_replace(\n rate(apiserver_request_total{job=\"apiserver\"}[1d]\n ), \"status_class\", \"${1}xx\", \"code\", \"([0-9])..\")\n)\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class:apiserver_request_total:rate1d"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (status_class) (\n label_replace(\n rate(apiserver_request_total{job=\"apiserver\"}[3d]\n ), \"status_class\", \"${1}xx\", \"code\", \"([0-9])..\")\n)\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class:apiserver_request_total:rate3d"
|
||||
},
|
||||
{
|
||||
"expr": "sum(status_class:apiserver_request_total:rate5m{job=\"apiserver\",status_class=\"5xx\"})\n/\nsum(status_class:apiserver_request_total:rate5m{job=\"apiserver\"})\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class_5xx:apiserver_request_total:ratio_rate5m"
|
||||
},
|
||||
{
|
||||
"expr": "sum(status_class:apiserver_request_total:rate30m{job=\"apiserver\",status_class=\"5xx\"})\n/\nsum(status_class:apiserver_request_total:rate30m{job=\"apiserver\"})\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class_5xx:apiserver_request_total:ratio_rate30m"
|
||||
},
|
||||
{
|
||||
"expr": "sum(status_class:apiserver_request_total:rate1h{job=\"apiserver\",status_class=\"5xx\"})\n/\nsum(status_class:apiserver_request_total:rate1h{job=\"apiserver\"})\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class_5xx:apiserver_request_total:ratio_rate1h"
|
||||
},
|
||||
{
|
||||
"expr": "sum(status_class:apiserver_request_total:rate2h{job=\"apiserver\",status_class=\"5xx\"})\n/\nsum(status_class:apiserver_request_total:rate2h{job=\"apiserver\"})\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class_5xx:apiserver_request_total:ratio_rate2h"
|
||||
},
|
||||
{
|
||||
"expr": "sum(status_class:apiserver_request_total:rate6h{job=\"apiserver\",status_class=\"5xx\"})\n/\nsum(status_class:apiserver_request_total:rate6h{job=\"apiserver\"})\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class_5xx:apiserver_request_total:ratio_rate6h"
|
||||
},
|
||||
{
|
||||
"expr": "sum(status_class:apiserver_request_total:rate1d{job=\"apiserver\",status_class=\"5xx\"})\n/\nsum(status_class:apiserver_request_total:rate1d{job=\"apiserver\"})\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class_5xx:apiserver_request_total:ratio_rate1d"
|
||||
},
|
||||
{
|
||||
"expr": "sum(status_class:apiserver_request_total:rate3d{job=\"apiserver\",status_class=\"5xx\"})\n/\nsum(status_class:apiserver_request_total:rate3d{job=\"apiserver\"})\n",
|
||||
"labels": {
|
||||
"job": "apiserver"
|
||||
},
|
||||
"record": "status_class_5xx:apiserver_request_total:ratio_rate3d"
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "kube-apiserver.rules",
|
||||
"rules": [
|
||||
{
|
||||
"expr": "histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\"}[5m])) without(instance, pod))\n",
|
||||
"expr": "sum(rate(apiserver_request_duration_seconds_sum{subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|PROXY|CONNECT\"}[5m])) without(instance, pod)\n/\nsum(rate(apiserver_request_duration_seconds_count{subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|PROXY|CONNECT\"}[5m])) without(instance, pod)\n",
|
||||
"record": "cluster:apiserver_request_duration_seconds:mean5m"
|
||||
},
|
||||
{
|
||||
"expr": "histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|PROXY|CONNECT\"}[5m])) without(instance, pod))\n",
|
||||
"labels": {
|
||||
"quantile": "0.99"
|
||||
},
|
||||
"record": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile"
|
||||
},
|
||||
{
|
||||
"expr": "histogram_quantile(0.9, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\"}[5m])) without(instance, pod))\n",
|
||||
"expr": "histogram_quantile(0.9, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|PROXY|CONNECT\"}[5m])) without(instance, pod))\n",
|
||||
"labels": {
|
||||
"quantile": "0.9"
|
||||
},
|
||||
"record": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile"
|
||||
},
|
||||
{
|
||||
"expr": "histogram_quantile(0.5, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\"}[5m])) without(instance, pod))\n",
|
||||
"expr": "histogram_quantile(0.5, sum(rate(apiserver_request_duration_seconds_bucket{job=\"apiserver\",subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|PROXY|CONNECT\"}[5m])) without(instance, pod))\n",
|
||||
"labels": {
|
||||
"quantile": "0.5"
|
||||
},
|
||||
@ -179,23 +286,23 @@ data:
|
||||
"record": "namespace:container_cpu_usage_seconds_total:sum_rate"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (namespace, pod, container) (\n rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", image!=\"\", container!=\"POD\"}[5m])\n) * on (namespace, pod) group_left(node) max by(namespace, pod, node) (kube_pod_info)\n",
|
||||
"expr": "sum by (cluster, namespace, pod, container) (\n rate(container_cpu_usage_seconds_total{job=\"kubernetes-cadvisor\", image!=\"\", container!=\"POD\"}[5m])\n) * on (cluster, namespace, pod) group_left(node) topk by (cluster, namespace, pod) (\n 1, max by(cluster, namespace, pod, node) (kube_pod_info)\n)\n",
|
||||
"record": "node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate"
|
||||
},
|
||||
{
|
||||
"expr": "container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) max by(namespace, pod, node) (kube_pod_info)\n",
|
||||
"expr": "container_memory_working_set_bytes{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) topk by(namespace, pod) (1,\n max by(namespace, pod, node) (kube_pod_info)\n)\n",
|
||||
"record": "node_namespace_pod_container:container_memory_working_set_bytes"
|
||||
},
|
||||
{
|
||||
"expr": "container_memory_rss{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) max by(namespace, pod, node) (kube_pod_info)\n",
|
||||
"expr": "container_memory_rss{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) topk by(namespace, pod) (1,\n max by(namespace, pod, node) (kube_pod_info)\n)\n",
|
||||
"record": "node_namespace_pod_container:container_memory_rss"
|
||||
},
|
||||
{
|
||||
"expr": "container_memory_cache{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) max by(namespace, pod, node) (kube_pod_info)\n",
|
||||
"expr": "container_memory_cache{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) topk by(namespace, pod) (1,\n max by(namespace, pod, node) (kube_pod_info)\n)\n",
|
||||
"record": "node_namespace_pod_container:container_memory_cache"
|
||||
},
|
||||
{
|
||||
"expr": "container_memory_swap{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) max by(namespace, pod, node) (kube_pod_info)\n",
|
||||
"expr": "container_memory_swap{job=\"kubernetes-cadvisor\", image!=\"\"}\n* on (namespace, pod) group_left(node) topk by(namespace, pod) (1,\n max by(namespace, pod, node) (kube_pod_info)\n)\n",
|
||||
"record": "node_namespace_pod_container:container_memory_swap"
|
||||
},
|
||||
{
|
||||
@ -203,29 +310,29 @@ data:
|
||||
"record": "namespace:container_memory_usage_bytes:sum"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (namespace, label_name) (\n sum(kube_pod_container_resource_requests_memory_bytes{job=\"kube-state-metrics\"} * on (endpoint, instance, job, namespace, pod, service) group_left(phase) (kube_pod_status_phase{phase=~\"Pending|Running\"} == 1)) by (namespace, pod)\n * on (namespace, pod)\n group_left(label_name) kube_pod_labels{job=\"kube-state-metrics\"}\n)\n",
|
||||
"expr": "sum by (namespace) (\n sum by (namespace, pod) (\n max by (namespace, pod, container) (\n kube_pod_container_resource_requests_memory_bytes{job=\"kube-state-metrics\"}\n ) * on(namespace, pod) group_left() max by (namespace, pod) (\n kube_pod_status_phase{phase=~\"Pending|Running\"} == 1\n )\n )\n)\n",
|
||||
"record": "namespace:kube_pod_container_resource_requests_memory_bytes:sum"
|
||||
},
|
||||
{
|
||||
"expr": "sum by (namespace, label_name) (\n sum(kube_pod_container_resource_requests_cpu_cores{job=\"kube-state-metrics\"} * on (endpoint, instance, job, namespace, pod, service) group_left(phase) (kube_pod_status_phase{phase=~\"Pending|Running\"} == 1)) by (namespace, pod)\n * on (namespace, pod)\n group_left(label_name) kube_pod_labels{job=\"kube-state-metrics\"}\n)\n",
|
||||
"expr": "sum by (namespace) (\n sum by (namespace, pod) (\n max by (namespace, pod, container) (\n kube_pod_container_resource_requests_cpu_cores{job=\"kube-state-metrics\"}\n ) * on(namespace, pod) group_left() max by (namespace, pod) (\n kube_pod_status_phase{phase=~\"Pending|Running\"} == 1\n )\n )\n)\n",
|
||||
"record": "namespace:kube_pod_container_resource_requests_cpu_cores:sum"
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n label_replace(\n label_replace(\n kube_pod_owner{job=\"kube-state-metrics\", owner_kind=\"ReplicaSet\"},\n \"replicaset\", \"$1\", \"owner_name\", \"(.*)\"\n ) * on(replicaset, namespace) group_left(owner_name) kube_replicaset_owner{job=\"kube-state-metrics\"},\n \"workload\", \"$1\", \"owner_name\", \"(.*)\"\n )\n) by (namespace, workload, pod)\n",
|
||||
"expr": "max by (cluster, namespace, workload, pod) (\n label_replace(\n label_replace(\n kube_pod_owner{job=\"kube-state-metrics\", owner_kind=\"ReplicaSet\"},\n \"replicaset\", \"$1\", \"owner_name\", \"(.*)\"\n ) * on(replicaset, namespace) group_left(owner_name) topk by(replicaset, namespace) (\n 1, max by (replicaset, namespace, owner_name) (\n kube_replicaset_owner{job=\"kube-state-metrics\"}\n )\n ),\n \"workload\", \"$1\", \"owner_name\", \"(.*)\"\n )\n)\n",
|
||||
"labels": {
|
||||
"workload_type": "deployment"
|
||||
},
|
||||
"record": "mixin_pod_workload"
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n label_replace(\n kube_pod_owner{job=\"kube-state-metrics\", owner_kind=\"DaemonSet\"},\n \"workload\", \"$1\", \"owner_name\", \"(.*)\"\n )\n) by (namespace, workload, pod)\n",
|
||||
"expr": "max by (cluster, namespace, workload, pod) (\n label_replace(\n kube_pod_owner{job=\"kube-state-metrics\", owner_kind=\"DaemonSet\"},\n \"workload\", \"$1\", \"owner_name\", \"(.*)\"\n )\n)\n",
|
||||
"labels": {
|
||||
"workload_type": "daemonset"
|
||||
},
|
||||
"record": "mixin_pod_workload"
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n label_replace(\n kube_pod_owner{job=\"kube-state-metrics\", owner_kind=\"StatefulSet\"},\n \"workload\", \"$1\", \"owner_name\", \"(.*)\"\n )\n) by (namespace, workload, pod)\n",
|
||||
"expr": "max by (cluster, namespace, workload, pod) (\n label_replace(\n kube_pod_owner{job=\"kube-state-metrics\", owner_kind=\"StatefulSet\"},\n \"workload\", \"$1\", \"owner_name\", \"(.*)\"\n )\n)\n",
|
||||
"labels": {
|
||||
"workload_type": "statefulset"
|
||||
},
|
||||
@ -305,23 +412,49 @@ data:
|
||||
"name": "node.rules",
|
||||
"rules": [
|
||||
{
|
||||
"expr": "sum(min(kube_pod_info) by (node))",
|
||||
"expr": "sum(min(kube_pod_info) by (cluster, node))\n",
|
||||
"record": ":kube_pod_info_node_count:"
|
||||
},
|
||||
{
|
||||
"expr": "max(label_replace(kube_pod_info{job=\"kube-state-metrics\"}, \"pod\", \"$1\", \"pod\", \"(.*)\")) by (node, namespace, pod)\n",
|
||||
"expr": "topk by(namespace, pod) (1,\n max by (node, namespace, pod) (\n label_replace(kube_pod_info{job=\"kube-state-metrics\"}, \"pod\", \"$1\", \"pod\", \"(.*)\")\n))\n",
|
||||
"record": "node_namespace_pod:kube_pod_info:"
|
||||
},
|
||||
{
|
||||
"expr": "count by (node) (sum by (node, cpu) (\n node_cpu_seconds_total{job=\"node-exporter\"}\n* on (namespace, pod) group_left(node)\n node_namespace_pod:kube_pod_info:\n))\n",
|
||||
"expr": "count by (cluster, node) (sum by (node, cpu) (\n node_cpu_seconds_total{job=\"node-exporter\"}\n* on (namespace, pod) group_left(node)\n node_namespace_pod:kube_pod_info:\n))\n",
|
||||
"record": "node:node_num_cpu:sum"
|
||||
},
|
||||
{
|
||||
"expr": "sum(\n node_memory_MemAvailable_bytes{job=\"node-exporter\"} or\n (\n node_memory_Buffers_bytes{job=\"node-exporter\"} +\n node_memory_Cached_bytes{job=\"node-exporter\"} +\n node_memory_MemFree_bytes{job=\"node-exporter\"} +\n node_memory_Slab_bytes{job=\"node-exporter\"}\n )\n)\n",
|
||||
"expr": "sum(\n node_memory_MemAvailable_bytes{job=\"node-exporter\"} or\n (\n node_memory_Buffers_bytes{job=\"node-exporter\"} +\n node_memory_Cached_bytes{job=\"node-exporter\"} +\n node_memory_MemFree_bytes{job=\"node-exporter\"} +\n node_memory_Slab_bytes{job=\"node-exporter\"}\n )\n) by (cluster)\n",
|
||||
"record": ":node_memory_MemAvailable_bytes:sum"
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "kubelet.rules",
|
||||
"rules": [
|
||||
{
|
||||
"expr": "histogram_quantile(0.99, sum(rate(kubelet_pleg_relist_duration_seconds_bucket[5m])) by (instance, le) * on(instance) group_left(node) kubelet_node_name{job=\"kubelet\"})\n",
|
||||
"labels": {
|
||||
"quantile": "0.99"
|
||||
},
|
||||
"record": "node_quantile:kubelet_pleg_relist_duration_seconds:histogram_quantile"
|
||||
},
|
||||
{
|
||||
"expr": "histogram_quantile(0.9, sum(rate(kubelet_pleg_relist_duration_seconds_bucket[5m])) by (instance, le) * on(instance) group_left(node) kubelet_node_name{job=\"kubelet\"})\n",
|
||||
"labels": {
|
||||
"quantile": "0.9"
|
||||
},
|
||||
"record": "node_quantile:kubelet_pleg_relist_duration_seconds:histogram_quantile"
|
||||
},
|
||||
{
|
||||
"expr": "histogram_quantile(0.5, sum(rate(kubelet_pleg_relist_duration_seconds_bucket[5m])) by (instance, le) * on(instance) group_left(node) kubelet_node_name{job=\"kubelet\"})\n",
|
||||
"labels": {
|
||||
"quantile": "0.5"
|
||||
},
|
||||
"record": "node_quantile:kubelet_pleg_relist_duration_seconds:histogram_quantile"
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "kubernetes-apps",
|
||||
"rules": [
|
||||
@ -343,7 +476,7 @@ data:
|
||||
"message": "Pod {{ $labels.namespace }}/{{ $labels.pod }} has been in a non-ready state for longer than 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodnotready"
|
||||
},
|
||||
"expr": "sum by (namespace, pod) (kube_pod_status_phase{job=\"kube-state-metrics\", phase=~\"Failed|Pending|Unknown\"} * on(namespace, pod) group_left(owner_kind) kube_pod_owner{owner_kind!=\"Job\"}) > 0\n",
|
||||
"expr": "sum by (namespace, pod) (max by(namespace, pod) (kube_pod_status_phase{job=\"kube-state-metrics\", phase=~\"Pending|Unknown\"}) * on(namespace, pod) group_left(owner_kind) max by(namespace, pod, owner_kind) (kube_pod_owner{owner_kind!=\"Job\"})) > 0\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
@ -367,7 +500,7 @@ data:
|
||||
"message": "Deployment {{ $labels.namespace }}/{{ $labels.deployment }} has not matched the expected number of replicas for longer than 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedeploymentreplicasmismatch"
|
||||
},
|
||||
"expr": "kube_deployment_spec_replicas{job=\"kube-state-metrics\"}\n !=\nkube_deployment_status_replicas_available{job=\"kube-state-metrics\"}\n",
|
||||
"expr": "(\n kube_deployment_spec_replicas{job=\"kube-state-metrics\"}\n !=\n kube_deployment_status_replicas_available{job=\"kube-state-metrics\"}\n) and (\n changes(kube_deployment_status_replicas_updated{job=\"kube-state-metrics\"}[5m])\n ==\n 0\n)\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
@ -379,7 +512,7 @@ data:
|
||||
"message": "StatefulSet {{ $labels.namespace }}/{{ $labels.statefulset }} has not matched the expected number of replicas for longer than 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubestatefulsetreplicasmismatch"
|
||||
},
|
||||
"expr": "kube_statefulset_status_replicas_ready{job=\"kube-state-metrics\"}\n !=\nkube_statefulset_status_replicas{job=\"kube-state-metrics\"}\n",
|
||||
"expr": "(\n kube_statefulset_status_replicas_ready{job=\"kube-state-metrics\"}\n !=\n kube_statefulset_status_replicas{job=\"kube-state-metrics\"}\n) and (\n changes(kube_statefulset_status_replicas_updated{job=\"kube-state-metrics\"}[5m])\n ==\n 0\n)\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
@ -528,7 +661,7 @@ data:
|
||||
"message": "Cluster has overcommitted CPU resource requests for Pods and cannot tolerate node failure.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecpuovercommit"
|
||||
},
|
||||
"expr": "sum(namespace:kube_pod_container_resource_requests_cpu_cores:sum)\n /\nsum(kube_node_status_allocatable_cpu_cores)\n >\n(count(kube_node_status_allocatable_cpu_cores)-1) / count(kube_node_status_allocatable_cpu_cores)\n",
|
||||
"expr": "sum(namespace:kube_pod_container_resource_requests_cpu_cores:sum{})\n /\nsum(kube_node_status_allocatable_cpu_cores)\n >\n(count(kube_node_status_allocatable_cpu_cores)-1) / count(kube_node_status_allocatable_cpu_cores)\n",
|
||||
"for": "5m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
@ -540,7 +673,7 @@ data:
|
||||
"message": "Cluster has overcommitted memory resource requests for Pods and cannot tolerate node failure.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubememovercommit"
|
||||
},
|
||||
"expr": "sum(namespace:kube_pod_container_resource_requests_memory_bytes:sum)\n /\nsum(kube_node_status_allocatable_memory_bytes)\n >\n(count(kube_node_status_allocatable_memory_bytes)-1)\n /\ncount(kube_node_status_allocatable_memory_bytes)\n",
|
||||
"expr": "sum(namespace:kube_pod_container_resource_requests_memory_bytes:sum{})\n /\nsum(kube_node_status_allocatable_memory_bytes)\n >\n(count(kube_node_status_allocatable_memory_bytes)-1)\n /\ncount(kube_node_status_allocatable_memory_bytes)\n",
|
||||
"for": "5m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
@ -618,7 +751,7 @@ data:
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepersistentvolumefullinfourdays"
|
||||
},
|
||||
"expr": "(\n kubelet_volume_stats_available_bytes{job=\"kubelet\"}\n /\n kubelet_volume_stats_capacity_bytes{job=\"kubelet\"}\n) < 0.15\nand\npredict_linear(kubelet_volume_stats_available_bytes{job=\"kubelet\"}[6h], 4 * 24 * 3600) < 0\n",
|
||||
"for": "5m",
|
||||
"for": "1h",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
}
|
||||
@ -666,17 +799,44 @@ data:
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "kube-apiserver-error-alerts",
|
||||
"rules": [
|
||||
{
|
||||
"alert": "ErrorBudgetBurn",
|
||||
"annotations": {
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-errorbudgetburn"
|
||||
},
|
||||
"expr": "(\n status_class_5xx:apiserver_request_total:ratio_rate1h{job=\"apiserver\"} > (14.4*0.010000)\n and\n status_class_5xx:apiserver_request_total:ratio_rate5m{job=\"apiserver\"} > (14.4*0.010000)\n)\nor\n(\n status_class_5xx:apiserver_request_total:ratio_rate6h{job=\"apiserver\"} > (6*0.010000)\n and\n status_class_5xx:apiserver_request_total:ratio_rate30m{job=\"apiserver\"} > (6*0.010000)\n)\n",
|
||||
"labels": {
|
||||
"job": "apiserver",
|
||||
"severity": "critical"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "ErrorBudgetBurn",
|
||||
"annotations": {
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-errorbudgetburn"
|
||||
},
|
||||
"expr": "(\n status_class_5xx:apiserver_request_total:ratio_rate1d{job=\"apiserver\"} > (3*0.010000)\n and\n status_class_5xx:apiserver_request_total:ratio_rate2h{job=\"apiserver\"} > (3*0.010000)\n)\nor\n(\n status_class_5xx:apiserver_request_total:ratio_rate3d{job=\"apiserver\"} > (0.010000)\n and\n status_class_5xx:apiserver_request_total:ratio_rate6h{job=\"apiserver\"} > (0.010000)\n)\n",
|
||||
"labels": {
|
||||
"job": "apiserver",
|
||||
"severity": "warning"
|
||||
}
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "kubernetes-system-apiserver",
|
||||
"rules": [
|
||||
{
|
||||
"alert": "KubeAPILatencyHigh",
|
||||
"annotations": {
|
||||
"message": "The API server has a 99th percentile latency of {{ $value }} seconds for {{ $labels.verb }} {{ $labels.resource }}.",
|
||||
"message": "The API server has an abnormal latency of {{ $value }} seconds for {{ $labels.verb }} {{ $labels.resource }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapilatencyhigh"
|
||||
},
|
||||
"expr": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile{job=\"apiserver\",quantile=\"0.99\",subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|PROXY|CONNECT\"} > 1\n",
|
||||
"for": "10m",
|
||||
"expr": "(\n cluster:apiserver_request_duration_seconds:mean5m{job=\"apiserver\"}\n >\n on (verb) group_left()\n (\n avg by (verb) (cluster:apiserver_request_duration_seconds:mean5m{job=\"apiserver\"} >= 0)\n +\n 2*stddev by (verb) (cluster:apiserver_request_duration_seconds:mean5m{job=\"apiserver\"} >= 0)\n )\n) > on (verb) group_left()\n1.2 * avg by (verb) (cluster:apiserver_request_duration_seconds:mean5m{job=\"apiserver\"} >= 0)\nand on (verb,resource)\ncluster_quantile:apiserver_request_duration_seconds:histogram_quantile{job=\"apiserver\",quantile=\"0.99\"}\n>\n1\n",
|
||||
"for": "5m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
@ -687,7 +847,7 @@ data:
|
||||
"message": "The API server has a 99th percentile latency of {{ $value }} seconds for {{ $labels.verb }} {{ $labels.resource }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapilatencyhigh"
|
||||
},
|
||||
"expr": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile{job=\"apiserver\",quantile=\"0.99\",subresource!=\"log\",verb!~\"LIST|WATCH|WATCHLIST|PROXY|CONNECT\"} > 4\n",
|
||||
"expr": "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile{job=\"apiserver\",quantile=\"0.99\"} > 4\n",
|
||||
"for": "10m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
@ -747,7 +907,7 @@ data:
|
||||
"message": "A client certificate used to authenticate to the apiserver is expiring in less than 7.0 days.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclientcertificateexpiration"
|
||||
},
|
||||
"expr": "apiserver_client_certificate_expiration_seconds_count{job=\"apiserver\"} > 0 and histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job=\"apiserver\"}[5m]))) < 604800\n",
|
||||
"expr": "apiserver_client_certificate_expiration_seconds_count{job=\"apiserver\"} > 0 and on(job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job=\"apiserver\"}[5m]))) < 604800\n",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
@ -758,11 +918,34 @@ data:
|
||||
"message": "A client certificate used to authenticate to the apiserver is expiring in less than 24.0 hours.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclientcertificateexpiration"
|
||||
},
|
||||
"expr": "apiserver_client_certificate_expiration_seconds_count{job=\"apiserver\"} > 0 and histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job=\"apiserver\"}[5m]))) < 86400\n",
|
||||
"expr": "apiserver_client_certificate_expiration_seconds_count{job=\"apiserver\"} > 0 and on(job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job=\"apiserver\"}[5m]))) < 86400\n",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "AggregatedAPIErrors",
|
||||
"annotations": {
|
||||
"message": "An aggregated API {{ $labels.name }}/{{ $labels.namespace }} has reported errors. The number of errors have increased for it in the past five minutes. High values indicate that the availability of the service changes too often.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-aggregatedapierrors"
|
||||
},
|
||||
"expr": "sum by(name, namespace)(increase(aggregator_unavailable_apiservice_count[5m])) > 2\n",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "AggregatedAPIDown",
|
||||
"annotations": {
|
||||
"message": "An aggregated API {{ $labels.name }}/{{ $labels.namespace }} is down. It has not been available at least for the past five minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-aggregatedapidown"
|
||||
},
|
||||
"expr": "sum by(name, namespace)(sum_over_time(aggregator_unavailable_apiservice[5m])) > 0\n",
|
||||
"for": "5m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeAPIDown",
|
||||
"annotations": {
|
||||
@ -799,6 +982,7 @@ data:
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubenodeunreachable"
|
||||
},
|
||||
"expr": "kube_node_spec_taint{job=\"kube-state-metrics\",key=\"node.kubernetes.io/unreachable\",effect=\"NoSchedule\"} == 1\n",
|
||||
"for": "2m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
@ -815,6 +999,42 @@ data:
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeNodeReadinessFlapping",
|
||||
"annotations": {
|
||||
"message": "The readiness status of node {{ $labels.node }} has changed {{ $value }} times in the last 15 minutes.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubenodereadinessflapping"
|
||||
},
|
||||
"expr": "sum(changes(kube_node_status_condition{status=\"true\",condition=\"Ready\"}[15m])) by (node) > 2\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeletPlegDurationHigh",
|
||||
"annotations": {
|
||||
"message": "The Kubelet Pod Lifecycle Event Generator has a 99th percentile duration of {{ $value }} seconds on node {{ $labels.node }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletplegdurationhigh"
|
||||
},
|
||||
"expr": "node_quantile:kubelet_pleg_relist_duration_seconds:histogram_quantile{quantile=\"0.99\"} >= 10\n",
|
||||
"for": "5m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeletPodStartUpLatencyHigh",
|
||||
"annotations": {
|
||||
"message": "Kubelet Pod startup 99th percentile latency is {{ $value }} seconds on node {{ $labels.node }}.",
|
||||
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletpodstartuplatencyhigh"
|
||||
},
|
||||
"expr": "histogram_quantile(0.99, sum(rate(kubelet_pod_worker_duration_seconds_bucket{job=\"kubelet\"}[5m])) by (instance, le)) * on(instance) group_left(node) kubelet_node_name > 5\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "KubeletDown",
|
||||
"annotations": {
|
||||
@ -1124,7 +1344,7 @@ data:
|
||||
{
|
||||
"alert": "PrometheusRemoteStorageFailures",
|
||||
"annotations": {
|
||||
"description": "Prometheus {{$labels.instance}} failed to send {{ printf \"%.1f\" $value }}% of the samples to queue {{$labels.queue}}.",
|
||||
"description": "Prometheus {{$labels.instance}} failed to send {{ printf \"%.1f\" $value }}% of the samples to {{ if $labels.queue }}{{ $labels.queue }}{{ else }}{{ $labels.url }}{{ end }}.",
|
||||
"summary": "Prometheus fails to send samples to remote storage."
|
||||
},
|
||||
"expr": "(\n rate(prometheus_remote_storage_failed_samples_total{job=\"prometheus\"}[5m])\n/\n (\n rate(prometheus_remote_storage_failed_samples_total{job=\"prometheus\"}[5m])\n +\n rate(prometheus_remote_storage_succeeded_samples_total{job=\"prometheus\"}[5m])\n )\n)\n* 100\n> 1\n",
|
||||
@ -1136,7 +1356,7 @@ data:
|
||||
{
|
||||
"alert": "PrometheusRemoteWriteBehind",
|
||||
"annotations": {
|
||||
"description": "Prometheus {{$labels.instance}} remote write is {{ printf \"%.1f\" $value }}s behind for queue {{$labels.queue}}.",
|
||||
"description": "Prometheus {{$labels.instance}} remote write is {{ printf \"%.1f\" $value }}s behind for {{ if $labels.queue }}{{ $labels.queue }}{{ else }}{{ $labels.url }}{{ end }}.",
|
||||
"summary": "Prometheus remote write is behind."
|
||||
},
|
||||
"expr": "# Without max_over_time, failed scrapes could create false negatives, see\n# https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details.\n(\n max_over_time(prometheus_remote_storage_highest_timestamp_in_seconds{job=\"prometheus\"}[5m])\n- on(job, instance) group_right\n max_over_time(prometheus_remote_storage_queue_highest_sent_timestamp_seconds{job=\"prometheus\"}[5m])\n)\n> 120\n",
|
||||
@ -1148,10 +1368,10 @@ data:
|
||||
{
|
||||
"alert": "PrometheusRemoteWriteDesiredShards",
|
||||
"annotations": {
|
||||
"description": "Prometheus {{$labels.instance}} remote write desired shards calculation wants to run {{ printf $value }} shards, which is more than the max of {{ printf `prometheus_remote_storage_shards_max{instance=\"%s\",job=\"prometheus\"}` $labels.instance | query | first | value }}.",
|
||||
"description": "Prometheus {{$labels.instance}} remote write desired shards calculation wants to run {{ $value }} shards, which is more than the max of {{ printf `prometheus_remote_storage_shards_max{instance=\"%s\",job=\"prometheus\"}` $labels.instance | query | first | value }}.",
|
||||
"summary": "Prometheus remote write desired shards calculation wants to run more than configured max shards."
|
||||
},
|
||||
"expr": "# Without max_over_time, failed scrapes could create false negatives, see\n# https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details.\n(\n max_over_time(prometheus_remote_storage_shards_desired{job=\"prometheus\"}[5m])\n> on(job, instance) group_right\n max_over_time(prometheus_remote_storage_shards_max{job=\"prometheus\"}[5m])\n)\n",
|
||||
"expr": "# Without max_over_time, failed scrapes could create false negatives, see\n# https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details.\n(\n max_over_time(prometheus_remote_storage_shards_desired{job=\"prometheus\"}[5m])\n>\n max_over_time(prometheus_remote_storage_shards_max{job=\"prometheus\"}[5m])\n)\n",
|
||||
"for": "15m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
@ -1201,6 +1421,17 @@ data:
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
}
|
||||
},
|
||||
{
|
||||
"alert": "BlackboxProbeFailure",
|
||||
"annotations": {
|
||||
"message": "Blackbox probe {{$labels.instance}} failed"
|
||||
},
|
||||
"expr": "probe_success == 0",
|
||||
"for": "2m",
|
||||
"labels": {
|
||||
"severity": "critical"
|
||||
}
|
||||
}
|
||||
]
|
||||
},
|
||||
@ -1212,7 +1443,7 @@ data:
|
||||
"annotations": {
|
||||
"message": "{{ $value }} RAID disk(s) on node {{ $labels.instance }} are inactive."
|
||||
},
|
||||
"expr": "node_md_disks - node_md_disks_active > 0",
|
||||
"expr": "node_md_disks{state=\"failed\"} > 0",
|
||||
"for": "10m",
|
||||
"labels": {
|
||||
"severity": "warning"
|
||||
|
@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.17.0 (upstream)
|
||||
* Kubernetes v1.18.0 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/cl/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
|
||||
|
@ -4,8 +4,8 @@ locals {
|
||||
# flatcar-stable -> Flatcar Linux AMI
|
||||
ami_id = local.flavor == "flatcar" ? data.aws_ami.flatcar.image_id : data.aws_ami.coreos.image_id
|
||||
|
||||
flavor = element(split("-", var.os_image), 0)
|
||||
channel = element(split("-", var.os_image), 1)
|
||||
flavor = split("-", var.os_image)[0]
|
||||
channel = split("-", var.os_image)[1]
|
||||
}
|
||||
|
||||
data "aws_ami" "coreos" {
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=24e5513ee6da4888005aac02101f73fc9e5e829f"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=cb170f802d09dcbe88b050257bc676e25d3c4282"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
|
||||
|
@ -7,7 +7,9 @@ systemd:
|
||||
- name: 40-etcd-cluster.conf
|
||||
contents: |
|
||||
[Service]
|
||||
Environment="ETCD_IMAGE_TAG=v3.4.3"
|
||||
Environment="ETCD_IMAGE_TAG=v3.4.5"
|
||||
Environment="ETCD_IMAGE_URL=docker://quay.io/coreos/etcd"
|
||||
Environment="RKT_RUN_ARGS=--insecure-options=image"
|
||||
Environment="ETCD_NAME=${etcd_name}"
|
||||
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
|
||||
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
|
||||
@ -50,29 +52,46 @@ systemd:
|
||||
Description=Kubelet via Hyperkube
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
EnvironmentFile=/etc/kubernetes/kubelet.env
|
||||
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/cache/kubelet-pod.uuid \
|
||||
--volume=resolv,kind=host,source=/etc/resolv.conf \
|
||||
--mount volume=resolv,target=/etc/resolv.conf \
|
||||
--volume var-lib-cni,kind=host,source=/var/lib/cni \
|
||||
--mount volume=var-lib-cni,target=/var/lib/cni \
|
||||
--volume var-lib-calico,kind=host,source=/var/lib/calico \
|
||||
--mount volume=var-lib-calico,target=/var/lib/calico \
|
||||
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
|
||||
--mount volume=opt-cni-bin,target=/opt/cni/bin \
|
||||
--volume var-log,kind=host,source=/var/log \
|
||||
--mount volume=var-log,target=/var/log \
|
||||
--insecure-options=image"
|
||||
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/cni
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
|
||||
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
|
||||
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
|
||||
ExecStart=/usr/lib/coreos/kubelet-wrapper \
|
||||
ExecStart=/usr/bin/rkt run \
|
||||
--uuid-file-save=/var/cache/kubelet-pod.uuid \
|
||||
--stage1-from-dir=stage1-fly.aci \
|
||||
--hosts-entry host \
|
||||
--insecure-options=image \
|
||||
--volume etc-kubernetes,kind=host,source=/etc/kubernetes,readOnly=true \
|
||||
--mount volume=etc-kubernetes,target=/etc/kubernetes \
|
||||
--volume etc-machine-id,kind=host,source=/etc/machine-id,readOnly=true \
|
||||
--mount volume=etc-machine-id,target=/etc/machine-id \
|
||||
--volume etc-os-release,kind=host,source=/usr/lib/os-release,readOnly=true \
|
||||
--mount volume=etc-os-release,target=/etc/os-release \
|
||||
--volume=etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true \
|
||||
--mount volume=etc-resolv,target=/etc/resolv.conf \
|
||||
--volume etc-ssl-certs,kind=host,source=/etc/ssl/certs,readOnly=true \
|
||||
--mount volume=etc-ssl-certs,target=/etc/ssl/certs \
|
||||
--volume lib-modules,kind=host,source=/lib/modules,readOnly=true \
|
||||
--mount volume=lib-modules,target=/lib/modules \
|
||||
--volume run,kind=host,source=/run \
|
||||
--mount volume=run,target=/run \
|
||||
--volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true \
|
||||
--mount volume=usr-share-certs,target=/usr/share/ca-certificates \
|
||||
--volume var-lib-calico,kind=host,source=/var/lib/calico,readOnly=true \
|
||||
--mount volume=var-lib-calico,target=/var/lib/calico \
|
||||
--volume var-lib-docker,kind=host,source=/var/lib/docker \
|
||||
--mount volume=var-lib-docker,target=/var/lib/docker \
|
||||
--volume var-lib-kubelet,kind=host,source=/var/lib/kubelet,recursive=true \
|
||||
--mount volume=var-lib-kubelet,target=/var/lib/kubelet \
|
||||
--volume var-log,kind=host,source=/var/log \
|
||||
--mount volume=var-log,target=/var/log \
|
||||
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
|
||||
--mount volume=opt-cni-bin,target=/opt/cni/bin \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.0 -- \
|
||||
--anonymous-auth=false \
|
||||
--authentication-token-webhook \
|
||||
--authorization-mode=Webhook \
|
||||
@ -82,6 +101,7 @@ systemd:
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--exit-on-lock-contention \
|
||||
--healthz-port=0 \
|
||||
--kubeconfig=/etc/kubernetes/kubeconfig \
|
||||
--lock-file=/var/run/lock/kubelet.lock \
|
||||
--network-plugin=cni \
|
||||
@ -105,7 +125,6 @@ systemd:
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
WorkingDirectory=/opt/bootstrap
|
||||
ExecStartPre=-/usr/bin/bash -c 'set -x && [ -n "$(ls /opt/bootstrap/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootstrap/assets/manifests-*/* /opt/bootstrap/assets/manifests && rm -rf /opt/bootstrap/assets/manifests-*'
|
||||
ExecStart=/usr/bin/rkt run \
|
||||
--trust-keys-from-https \
|
||||
--volume config,kind=host,source=/etc/kubernetes/bootstrap-secrets \
|
||||
@ -115,7 +134,7 @@ systemd:
|
||||
--volume script,kind=host,source=/opt/bootstrap/apply \
|
||||
--mount volume=script,target=/apply \
|
||||
--insecure-options=image \
|
||||
docker://k8s.gcr.io/hyperkube:v1.17.0 \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.0 \
|
||||
--net=host \
|
||||
--dns=host \
|
||||
--exec=/apply
|
||||
@ -130,14 +149,6 @@ storage:
|
||||
contents:
|
||||
inline: |
|
||||
${kubeconfig}
|
||||
- path: /etc/kubernetes/kubelet.env
|
||||
filesystem: root
|
||||
mode: 0644
|
||||
contents:
|
||||
inline: |
|
||||
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
|
||||
KUBELET_IMAGE_TAG=v1.17.0
|
||||
KUBELET_IMAGE_ARGS="--exec=/usr/local/bin/kubelet"
|
||||
- path: /opt/bootstrap/layout
|
||||
filesystem: root
|
||||
mode: 0544
|
||||
@ -158,7 +169,9 @@ storage:
|
||||
sudo mv static-manifests/* /etc/kubernetes/manifests/
|
||||
sudo mkdir -p /opt/bootstrap/assets
|
||||
sudo mv manifests /opt/bootstrap/assets/manifests
|
||||
sudo mv manifests-networking /opt/bootstrap/assets/manifests-networking
|
||||
sudo mkdir -p /opt/bootstrap/assets/manifests/crds
|
||||
sudo mv manifests-networking/crd*.yaml /opt/bootstrap/assets/manifests/crds
|
||||
sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/
|
||||
rm -rf assets auth static-manifests tls
|
||||
- path: /opt/bootstrap/apply
|
||||
filesystem: root
|
||||
@ -171,6 +184,10 @@ storage:
|
||||
echo "Waiting for static pod control plane"
|
||||
sleep 5
|
||||
done
|
||||
until kubectl apply -f /assets/manifests/crds -R; do
|
||||
echo "Retry Custom Resource Definition manifests"
|
||||
sleep 5
|
||||
done
|
||||
until kubectl apply -f /assets/manifests -R; do
|
||||
echo "Retry applying manifests"
|
||||
sleep 5
|
@ -10,7 +10,7 @@ resource "aws_route53_record" "etcds" {
|
||||
ttl = 300
|
||||
|
||||
# private IPv4 address for etcd
|
||||
records = [element(aws_instance.controllers.*.private_ip, count.index)]
|
||||
records = [aws_instance.controllers.*.private_ip[count.index]]
|
||||
}
|
||||
|
||||
# Controller instances
|
||||
@ -24,7 +24,7 @@ resource "aws_instance" "controllers" {
|
||||
instance_type = var.controller_type
|
||||
|
||||
ami = local.ami_id
|
||||
user_data = element(data.ct_config.controller-ignitions.*.rendered, count.index)
|
||||
user_data = data.ct_config.controller-ignitions.*.rendered[count.index]
|
||||
|
||||
# storage
|
||||
root_block_device {
|
||||
@ -36,7 +36,7 @@ resource "aws_instance" "controllers" {
|
||||
|
||||
# network
|
||||
associate_public_ip_address = true
|
||||
subnet_id = element(aws_subnet.public.*.id, count.index)
|
||||
subnet_id = aws_subnet.public.*.id[count.index]
|
||||
vpc_security_group_ids = [aws_security_group.controller.id]
|
||||
|
||||
lifecycle {
|
||||
@ -49,11 +49,8 @@ resource "aws_instance" "controllers" {
|
||||
|
||||
# Controller Ignition configs
|
||||
data "ct_config" "controller-ignitions" {
|
||||
count = var.controller_count
|
||||
content = element(
|
||||
data.template_file.controller-configs.*.rendered,
|
||||
count.index,
|
||||
)
|
||||
count = var.controller_count
|
||||
content = data.template_file.controller-configs.*.rendered[count.index]
|
||||
pretty_print = false
|
||||
snippets = var.controller_clc_snippets
|
||||
}
|
||||
@ -62,7 +59,7 @@ data "ct_config" "controller-ignitions" {
|
||||
data "template_file" "controller-configs" {
|
||||
count = var.controller_count
|
||||
|
||||
template = file("${path.module}/cl/controller.yaml.tmpl")
|
||||
template = file("${path.module}/cl/controller.yaml")
|
||||
|
||||
vars = {
|
||||
# Cannot use cyclic dependencies on controllers or their DNS records
|
||||
|
@ -25,21 +25,23 @@ resource "aws_internet_gateway" "gateway" {
|
||||
resource "aws_route_table" "default" {
|
||||
vpc_id = aws_vpc.network.id
|
||||
|
||||
route {
|
||||
cidr_block = "0.0.0.0/0"
|
||||
gateway_id = aws_internet_gateway.gateway.id
|
||||
}
|
||||
|
||||
route {
|
||||
ipv6_cidr_block = "::/0"
|
||||
gateway_id = aws_internet_gateway.gateway.id
|
||||
}
|
||||
|
||||
tags = {
|
||||
"Name" = var.cluster_name
|
||||
}
|
||||
}
|
||||
|
||||
resource "aws_route" "egress-ipv4" {
|
||||
route_table_id = aws_route_table.default.id
|
||||
destination_cidr_block = "0.0.0.0/0"
|
||||
gateway_id = aws_internet_gateway.gateway.id
|
||||
}
|
||||
|
||||
resource "aws_route" "egress-ipv6" {
|
||||
route_table_id = aws_route_table.default.id
|
||||
destination_ipv6_cidr_block = "::/0"
|
||||
gateway_id = aws_internet_gateway.gateway.id
|
||||
}
|
||||
|
||||
# Subnets (one per availability zone)
|
||||
|
||||
resource "aws_subnet" "public" {
|
||||
@ -62,6 +64,6 @@ resource "aws_route_table_association" "public" {
|
||||
count = length(data.aws_availability_zones.all.names)
|
||||
|
||||
route_table_id = aws_route_table.default.id
|
||||
subnet_id = element(aws_subnet.public.*.id, count.index)
|
||||
subnet_id = aws_subnet.public.*.id[count.index]
|
||||
}
|
||||
|
||||
|
@ -88,7 +88,7 @@ resource "aws_lb_target_group_attachment" "controllers" {
|
||||
count = var.controller_count
|
||||
|
||||
target_group_arn = aws_lb_target_group.controllers.arn
|
||||
target_id = element(aws_instance.controllers.*.id, count.index)
|
||||
target_id = aws_instance.controllers.*.id[count.index]
|
||||
port = 6443
|
||||
}
|
||||
|
||||
|
@ -33,6 +33,28 @@ resource "aws_security_group_rule" "controller-etcd" {
|
||||
self = true
|
||||
}
|
||||
|
||||
# Allow Prometheus to scrape etcd metrics
|
||||
resource "aws_security_group_rule" "controller-etcd-metrics" {
|
||||
security_group_id = aws_security_group.controller.id
|
||||
|
||||
type = "ingress"
|
||||
protocol = "tcp"
|
||||
from_port = 2381
|
||||
to_port = 2381
|
||||
source_security_group_id = aws_security_group.worker.id
|
||||
}
|
||||
|
||||
# Allow Prometheus to scrape kube-proxy
|
||||
resource "aws_security_group_rule" "kube-proxy-metrics" {
|
||||
security_group_id = aws_security_group.controller.id
|
||||
|
||||
type = "ingress"
|
||||
protocol = "tcp"
|
||||
from_port = 10249
|
||||
to_port = 10249
|
||||
source_security_group_id = aws_security_group.worker.id
|
||||
}
|
||||
|
||||
# Allow Prometheus to scrape kube-scheduler
|
||||
resource "aws_security_group_rule" "controller-scheduler-metrics" {
|
||||
security_group_id = aws_security_group.controller.id
|
||||
@ -55,17 +77,6 @@ resource "aws_security_group_rule" "controller-manager-metrics" {
|
||||
source_security_group_id = aws_security_group.worker.id
|
||||
}
|
||||
|
||||
# Allow Prometheus to scrape etcd metrics
|
||||
resource "aws_security_group_rule" "controller-etcd-metrics" {
|
||||
security_group_id = aws_security_group.controller.id
|
||||
|
||||
type = "ingress"
|
||||
protocol = "tcp"
|
||||
from_port = 2381
|
||||
to_port = 2381
|
||||
source_security_group_id = aws_security_group.worker.id
|
||||
}
|
||||
|
||||
resource "aws_security_group_rule" "controller-vxlan" {
|
||||
count = var.networking == "flannel" ? 1 : 0
|
||||
|
||||
@ -281,14 +292,15 @@ resource "aws_security_group_rule" "worker-node-exporter" {
|
||||
self = true
|
||||
}
|
||||
|
||||
resource "aws_security_group_rule" "ingress-health" {
|
||||
# Allow Prometheus to scrape kube-proxy
|
||||
resource "aws_security_group_rule" "worker-kube-proxy" {
|
||||
security_group_id = aws_security_group.worker.id
|
||||
|
||||
type = "ingress"
|
||||
protocol = "tcp"
|
||||
from_port = 10254
|
||||
to_port = 10254
|
||||
cidr_blocks = ["0.0.0.0/0"]
|
||||
type = "ingress"
|
||||
protocol = "tcp"
|
||||
from_port = 10249
|
||||
to_port = 10249
|
||||
self = true
|
||||
}
|
||||
|
||||
# Allow apiserver to access kubelets for exec, log, port-forward
|
||||
@ -313,6 +325,16 @@ resource "aws_security_group_rule" "worker-kubelet-self" {
|
||||
self = true
|
||||
}
|
||||
|
||||
resource "aws_security_group_rule" "ingress-health" {
|
||||
security_group_id = aws_security_group.worker.id
|
||||
|
||||
type = "ingress"
|
||||
protocol = "tcp"
|
||||
from_port = 10254
|
||||
to_port = 10254
|
||||
cidr_blocks = ["0.0.0.0/0"]
|
||||
}
|
||||
|
||||
resource "aws_security_group_rule" "worker-bgp" {
|
||||
security_group_id = aws_security_group.worker.id
|
||||
|
||||
|
@ -2,7 +2,7 @@ locals {
|
||||
# format assets for distribution
|
||||
assets_bundle = [
|
||||
# header with the unpack location
|
||||
for key, value in module.bootstrap.assets_dist:
|
||||
for key, value in module.bootstrap.assets_dist :
|
||||
format("##### %s\n%s", key, value)
|
||||
]
|
||||
}
|
||||
|
@ -4,8 +4,8 @@ locals {
|
||||
# flatcar-stable -> Flatcar Linux AMI
|
||||
ami_id = local.flavor == "flatcar" ? data.aws_ami.flatcar.image_id : data.aws_ami.coreos.image_id
|
||||
|
||||
flavor = element(split("-", var.os_image), 0)
|
||||
channel = element(split("-", var.os_image), 1)
|
||||
flavor = split("-", var.os_image)[0]
|
||||
channel = split("-", var.os_image)[1]
|
||||
}
|
||||
|
||||
data "aws_ami" "coreos" {
|
||||
|
@ -25,29 +25,46 @@ systemd:
|
||||
Description=Kubelet via Hyperkube
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
EnvironmentFile=/etc/kubernetes/kubelet.env
|
||||
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/cache/kubelet-pod.uuid \
|
||||
--volume=resolv,kind=host,source=/etc/resolv.conf \
|
||||
--mount volume=resolv,target=/etc/resolv.conf \
|
||||
--volume var-lib-cni,kind=host,source=/var/lib/cni \
|
||||
--mount volume=var-lib-cni,target=/var/lib/cni \
|
||||
--volume var-lib-calico,kind=host,source=/var/lib/calico \
|
||||
--mount volume=var-lib-calico,target=/var/lib/calico \
|
||||
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
|
||||
--mount volume=opt-cni-bin,target=/opt/cni/bin \
|
||||
--volume var-log,kind=host,source=/var/log \
|
||||
--mount volume=var-log,target=/var/log \
|
||||
--insecure-options=image"
|
||||
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/cni
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
|
||||
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
|
||||
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
|
||||
ExecStart=/usr/lib/coreos/kubelet-wrapper \
|
||||
ExecStart=/usr/bin/rkt run \
|
||||
--uuid-file-save=/var/cache/kubelet-pod.uuid \
|
||||
--stage1-from-dir=stage1-fly.aci \
|
||||
--hosts-entry host \
|
||||
--insecure-options=image \
|
||||
--volume etc-kubernetes,kind=host,source=/etc/kubernetes,readOnly=true \
|
||||
--mount volume=etc-kubernetes,target=/etc/kubernetes \
|
||||
--volume etc-machine-id,kind=host,source=/etc/machine-id,readOnly=true \
|
||||
--mount volume=etc-machine-id,target=/etc/machine-id \
|
||||
--volume etc-os-release,kind=host,source=/usr/lib/os-release,readOnly=true \
|
||||
--mount volume=etc-os-release,target=/etc/os-release \
|
||||
--volume=etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true \
|
||||
--mount volume=etc-resolv,target=/etc/resolv.conf \
|
||||
--volume etc-ssl-certs,kind=host,source=/etc/ssl/certs,readOnly=true \
|
||||
--mount volume=etc-ssl-certs,target=/etc/ssl/certs \
|
||||
--volume lib-modules,kind=host,source=/lib/modules,readOnly=true \
|
||||
--mount volume=lib-modules,target=/lib/modules \
|
||||
--volume run,kind=host,source=/run \
|
||||
--mount volume=run,target=/run \
|
||||
--volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true \
|
||||
--mount volume=usr-share-certs,target=/usr/share/ca-certificates \
|
||||
--volume var-lib-calico,kind=host,source=/var/lib/calico,readOnly=true \
|
||||
--mount volume=var-lib-calico,target=/var/lib/calico \
|
||||
--volume var-lib-docker,kind=host,source=/var/lib/docker \
|
||||
--mount volume=var-lib-docker,target=/var/lib/docker \
|
||||
--volume var-lib-kubelet,kind=host,source=/var/lib/kubelet,recursive=true \
|
||||
--mount volume=var-lib-kubelet,target=/var/lib/kubelet \
|
||||
--volume var-log,kind=host,source=/var/log \
|
||||
--mount volume=var-log,target=/var/log \
|
||||
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
|
||||
--mount volume=opt-cni-bin,target=/opt/cni/bin \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.0 -- \
|
||||
--anonymous-auth=false \
|
||||
--authentication-token-webhook \
|
||||
--authorization-mode=Webhook \
|
||||
@ -57,13 +74,14 @@ systemd:
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--exit-on-lock-contention \
|
||||
--healthz-port=0 \
|
||||
--kubeconfig=/etc/kubernetes/kubeconfig \
|
||||
--lock-file=/var/run/lock/kubelet.lock \
|
||||
--network-plugin=cni \
|
||||
--node-labels=node.kubernetes.io/node \
|
||||
%{ for label in split(",", node_labels) }
|
||||
%{~ for label in split(",", node_labels) ~}
|
||||
--node-labels=${label} \
|
||||
%{ endfor ~}
|
||||
%{~ endfor ~}
|
||||
--pod-manifest-path=/etc/kubernetes/manifests \
|
||||
--read-only-port=0 \
|
||||
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
|
||||
@ -92,14 +110,6 @@ storage:
|
||||
contents:
|
||||
inline: |
|
||||
${kubeconfig}
|
||||
- path: /etc/kubernetes/kubelet.env
|
||||
filesystem: root
|
||||
mode: 0644
|
||||
contents:
|
||||
inline: |
|
||||
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
|
||||
KUBELET_IMAGE_TAG=v1.17.0
|
||||
KUBELET_IMAGE_ARGS="--exec=/usr/local/bin/kubelet"
|
||||
- path: /etc/sysctl.d/max-user-watches.conf
|
||||
filesystem: root
|
||||
contents:
|
||||
@ -117,11 +127,10 @@ storage:
|
||||
--volume config,kind=host,source=/etc/kubernetes \
|
||||
--mount volume=config,target=/etc/kubernetes \
|
||||
--insecure-options=image \
|
||||
docker://k8s.gcr.io/hyperkube:v1.17.0 \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.0 \
|
||||
--net=host \
|
||||
--dns=host \
|
||||
-- \
|
||||
kubectl --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
|
||||
--exec=/usr/local/bin/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
|
||||
passwd:
|
||||
users:
|
||||
- name: core
|
@ -78,7 +78,7 @@ data "ct_config" "worker-ignition" {
|
||||
|
||||
# Worker Container Linux config
|
||||
data "template_file" "worker-config" {
|
||||
template = file("${path.module}/cl/worker.yaml.tmpl")
|
||||
template = file("${path.module}/cl/worker.yaml")
|
||||
|
||||
vars = {
|
||||
kubeconfig = indent(10, var.kubeconfig)
|
||||
|
@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.17.0 (upstream)
|
||||
* Kubernetes v1.18.0 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/cl/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
|
||||
|
@ -15,9 +15,14 @@ data "aws_ami" "fedora-coreos" {
|
||||
|
||||
filter {
|
||||
name = "name"
|
||||
values = ["fedora-coreos-30.*.*-hvm"]
|
||||
values = ["fedora-coreos-31.*.*.*-hvm"]
|
||||
}
|
||||
|
||||
filter {
|
||||
name = "description"
|
||||
values = ["Fedora CoreOS stable*"]
|
||||
}
|
||||
|
||||
# try to filter out dev images (AWS filters can't)
|
||||
name_regex = "^fedora-coreos-30.[0-9]*.[0-9]*-hvm*"
|
||||
name_regex = "^fedora-coreos-31.[0-9]*.[0-9]*.[0-9]*-hvm*"
|
||||
}
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=24e5513ee6da4888005aac02101f73fc9e5e829f"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=cb170f802d09dcbe88b050257bc676e25d3c4282"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
|
||||
|
@ -28,7 +28,7 @@ systemd:
|
||||
--network host \
|
||||
--volume /var/lib/etcd:/var/lib/etcd:rw,Z \
|
||||
--volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \
|
||||
quay.io/coreos/etcd:v3.4.3
|
||||
quay.io/coreos/etcd:v3.4.5
|
||||
ExecStop=/usr/bin/podman stop etcd
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
@ -73,14 +73,13 @@ systemd:
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
|
||||
--volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
|
||||
--volume /etc/pki/tls/certs:/usr/share/ca-certificates:ro \
|
||||
--volume /var/lib/calico:/var/lib/calico \
|
||||
--volume /var/lib/calico:/var/lib/calico:ro \
|
||||
--volume /var/lib/docker:/var/lib/docker \
|
||||
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
|
||||
--volume /var/log:/var/log \
|
||||
--volume /var/run:/var/run \
|
||||
--volume /var/run/lock:/var/run/lock:z \
|
||||
--volume /opt/cni/bin:/opt/cni/bin:z \
|
||||
k8s.gcr.io/hyperkube:v1.17.0 kubelet \
|
||||
quay.io/poseidon/kubelet:v1.18.0 \
|
||||
--anonymous-auth=false \
|
||||
--authentication-token-webhook \
|
||||
--authorization-mode=Webhook \
|
||||
@ -92,6 +91,7 @@ systemd:
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--exit-on-lock-contention \
|
||||
--healthz-port=0 \
|
||||
--kubeconfig=/etc/kubernetes/kubeconfig \
|
||||
--lock-file=/var/run/lock/kubelet.lock \
|
||||
--network-plugin=cni \
|
||||
@ -116,14 +116,13 @@ systemd:
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
WorkingDirectory=/opt/bootstrap
|
||||
ExecStartPre=-/usr/bin/bash -c 'set -x && [ -n "$(ls /opt/bootstrap/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootstrap/assets/manifests-*/* /opt/bootstrap/assets/manifests && rm -rf /opt/bootstrap/assets/manifests-*'
|
||||
ExecStart=/usr/bin/podman run --name bootstrap \
|
||||
--network host \
|
||||
--volume /etc/kubernetes/bootstrap-secrets:/etc/kubernetes/secrets:ro,Z \
|
||||
--volume /opt/bootstrap/assets:/assets:ro,Z \
|
||||
--volume /opt/bootstrap/apply:/apply:ro,Z \
|
||||
--entrypoint=/apply \
|
||||
k8s.gcr.io/hyperkube:v1.17.0
|
||||
quay.io/poseidon/kubelet:v1.18.0
|
||||
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
|
||||
ExecStartPost=-/usr/bin/podman stop bootstrap
|
||||
storage:
|
||||
@ -155,7 +154,9 @@ storage:
|
||||
sudo mv static-manifests/* /etc/kubernetes/manifests/
|
||||
sudo mkdir -p /opt/bootstrap/assets
|
||||
sudo mv manifests /opt/bootstrap/assets/manifests
|
||||
sudo mv manifests-networking /opt/bootstrap/assets/manifests-networking
|
||||
sudo mkdir -p /opt/bootstrap/assets/manifests/crds
|
||||
sudo mv manifests-networking/crd*.yaml /opt/bootstrap/assets/manifests/crds
|
||||
sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/
|
||||
rm -rf assets auth static-manifests tls
|
||||
- path: /opt/bootstrap/apply
|
||||
mode: 0544
|
||||
@ -167,14 +168,14 @@ storage:
|
||||
echo "Waiting for static pod control plane"
|
||||
sleep 5
|
||||
done
|
||||
until kubectl apply -f /assets/manifests/crds -R; do
|
||||
echo "Retry Custom Resource Definition manifests"
|
||||
sleep 5
|
||||
done
|
||||
until kubectl apply -f /assets/manifests -R; do
|
||||
echo "Retry applying manifests"
|
||||
sleep 5
|
||||
done
|
||||
- path: /etc/sysctl.d/reverse-path-filter.conf
|
||||
contents:
|
||||
inline: |
|
||||
net.ipv4.conf.all.rp_filter=1
|
||||
- path: /etc/sysctl.d/max-user-watches.conf
|
||||
contents:
|
||||
inline: |
|
||||
|
@ -25,21 +25,23 @@ resource "aws_internet_gateway" "gateway" {
|
||||
resource "aws_route_table" "default" {
|
||||
vpc_id = aws_vpc.network.id
|
||||
|
||||
route {
|
||||
cidr_block = "0.0.0.0/0"
|
||||
gateway_id = aws_internet_gateway.gateway.id
|
||||
}
|
||||
|
||||
route {
|
||||
ipv6_cidr_block = "::/0"
|
||||
gateway_id = aws_internet_gateway.gateway.id
|
||||
}
|
||||
|
||||
tags = {
|
||||
"Name" = var.cluster_name
|
||||
}
|
||||
}
|
||||
|
||||
resource "aws_route" "egress-ipv4" {
|
||||
route_table_id = aws_route_table.default.id
|
||||
destination_cidr_block = "0.0.0.0/0"
|
||||
gateway_id = aws_internet_gateway.gateway.id
|
||||
}
|
||||
|
||||
resource "aws_route" "egress-ipv6" {
|
||||
route_table_id = aws_route_table.default.id
|
||||
destination_ipv6_cidr_block = "::/0"
|
||||
gateway_id = aws_internet_gateway.gateway.id
|
||||
}
|
||||
|
||||
# Subnets (one per availability zone)
|
||||
|
||||
resource "aws_subnet" "public" {
|
||||
|
@ -44,6 +44,17 @@ resource "aws_security_group_rule" "controller-etcd-metrics" {
|
||||
source_security_group_id = aws_security_group.worker.id
|
||||
}
|
||||
|
||||
# Allow Prometheus to scrape kube-proxy
|
||||
resource "aws_security_group_rule" "kube-proxy-metrics" {
|
||||
security_group_id = aws_security_group.controller.id
|
||||
|
||||
type = "ingress"
|
||||
protocol = "tcp"
|
||||
from_port = 10249
|
||||
to_port = 10249
|
||||
source_security_group_id = aws_security_group.worker.id
|
||||
}
|
||||
|
||||
# Allow Prometheus to scrape kube-scheduler
|
||||
resource "aws_security_group_rule" "controller-scheduler-metrics" {
|
||||
security_group_id = aws_security_group.controller.id
|
||||
@ -281,14 +292,15 @@ resource "aws_security_group_rule" "worker-node-exporter" {
|
||||
self = true
|
||||
}
|
||||
|
||||
resource "aws_security_group_rule" "ingress-health" {
|
||||
# Allow Prometheus to scrape kube-proxy
|
||||
resource "aws_security_group_rule" "worker-kube-proxy" {
|
||||
security_group_id = aws_security_group.worker.id
|
||||
|
||||
type = "ingress"
|
||||
protocol = "tcp"
|
||||
from_port = 10254
|
||||
to_port = 10254
|
||||
cidr_blocks = ["0.0.0.0/0"]
|
||||
type = "ingress"
|
||||
protocol = "tcp"
|
||||
from_port = 10249
|
||||
to_port = 10249
|
||||
self = true
|
||||
}
|
||||
|
||||
# Allow apiserver to access kubelets for exec, log, port-forward
|
||||
@ -313,6 +325,16 @@ resource "aws_security_group_rule" "worker-kubelet-self" {
|
||||
self = true
|
||||
}
|
||||
|
||||
resource "aws_security_group_rule" "ingress-health" {
|
||||
security_group_id = aws_security_group.worker.id
|
||||
|
||||
type = "ingress"
|
||||
protocol = "tcp"
|
||||
from_port = 10254
|
||||
to_port = 10254
|
||||
cidr_blocks = ["0.0.0.0/0"]
|
||||
}
|
||||
|
||||
resource "aws_security_group_rule" "worker-bgp" {
|
||||
security_group_id = aws_security_group.worker.id
|
||||
|
||||
|
@ -2,7 +2,7 @@ locals {
|
||||
# format assets for distribution
|
||||
assets_bundle = [
|
||||
# header with the unpack location
|
||||
for key, value in module.bootstrap.assets_dist:
|
||||
for key, value in module.bootstrap.assets_dist :
|
||||
format("##### %s\n%s", key, value)
|
||||
]
|
||||
}
|
||||
@ -21,7 +21,7 @@ resource "null_resource" "copy-controller-secrets" {
|
||||
user = "core"
|
||||
timeout = "15m"
|
||||
}
|
||||
|
||||
|
||||
provisioner "file" {
|
||||
content = join("\n", local.assets_bundle)
|
||||
destination = "$HOME/assets"
|
||||
|
@ -15,9 +15,14 @@ data "aws_ami" "fedora-coreos" {
|
||||
|
||||
filter {
|
||||
name = "name"
|
||||
values = ["fedora-coreos-30.*.*-hvm"]
|
||||
values = ["fedora-coreos-31.*.*.*-hvm"]
|
||||
}
|
||||
|
||||
filter {
|
||||
name = "description"
|
||||
values = ["Fedora CoreOS stable*"]
|
||||
}
|
||||
|
||||
# try to filter out dev images (AWS filters can't)
|
||||
name_regex = "^fedora-coreos-30.[0-9]*.[0-9]*-hvm*"
|
||||
name_regex = "^fedora-coreos-31.[0-9]*.[0-9]*.[0-9]*-hvm*"
|
||||
}
|
||||
|
@ -43,14 +43,13 @@ systemd:
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
|
||||
--volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
|
||||
--volume /etc/pki/tls/certs:/usr/share/ca-certificates:ro \
|
||||
--volume /var/lib/calico:/var/lib/calico \
|
||||
--volume /var/lib/calico:/var/lib/calico:ro \
|
||||
--volume /var/lib/docker:/var/lib/docker \
|
||||
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
|
||||
--volume /var/log:/var/log \
|
||||
--volume /var/run:/var/run \
|
||||
--volume /var/run/lock:/var/run/lock:z \
|
||||
--volume /opt/cni/bin:/opt/cni/bin:z \
|
||||
k8s.gcr.io/hyperkube:v1.17.0 kubelet \
|
||||
quay.io/poseidon/kubelet:v1.18.0 \
|
||||
--anonymous-auth=false \
|
||||
--authentication-token-webhook \
|
||||
--authorization-mode=Webhook \
|
||||
@ -62,13 +61,14 @@ systemd:
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--exit-on-lock-contention \
|
||||
--healthz-port=0 \
|
||||
--kubeconfig=/etc/kubernetes/kubeconfig \
|
||||
--lock-file=/var/run/lock/kubelet.lock \
|
||||
--network-plugin=cni \
|
||||
--node-labels=node.kubernetes.io/node \
|
||||
%{ for label in split(",", node_labels) }
|
||||
%{~ for label in split(",", node_labels) ~}
|
||||
--node-labels=${label} \
|
||||
%{ endfor ~}
|
||||
%{~ endfor ~}
|
||||
--pod-manifest-path=/etc/kubernetes/manifests \
|
||||
--read-only-port=0 \
|
||||
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
|
||||
@ -78,6 +78,18 @@ systemd:
|
||||
RestartSec=10
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
- name: delete-node.service
|
||||
enabled: true
|
||||
contents: |
|
||||
[Unit]
|
||||
Description=Delete Kubernetes node on shutdown
|
||||
[Service]
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
ExecStart=/bin/true
|
||||
ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.18.0 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME'
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
storage:
|
||||
directories:
|
||||
- path: /etc/kubernetes
|
||||
@ -87,10 +99,6 @@ storage:
|
||||
contents:
|
||||
inline: |
|
||||
${kubeconfig}
|
||||
- path: /etc/sysctl.d/reverse-path-filter.conf
|
||||
contents:
|
||||
inline: |
|
||||
net.ipv4.conf.all.rp_filter=1
|
||||
- path: /etc/sysctl.d/max-user-watches.conf
|
||||
contents:
|
||||
inline: |
|
||||
|
@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.17.0 (upstream)
|
||||
* Kubernetes v1.18.0 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [low-priority](https://typhoon.psdn.io/cl/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=24e5513ee6da4888005aac02101f73fc9e5e829f"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=cb170f802d09dcbe88b050257bc676e25d3c4282"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
|
||||
|
@ -7,7 +7,9 @@ systemd:
|
||||
- name: 40-etcd-cluster.conf
|
||||
contents: |
|
||||
[Service]
|
||||
Environment="ETCD_IMAGE_TAG=v3.4.3"
|
||||
Environment="ETCD_IMAGE_TAG=v3.4.5"
|
||||
Environment="ETCD_IMAGE_URL=docker://quay.io/coreos/etcd"
|
||||
Environment="RKT_RUN_ARGS=--insecure-options=image"
|
||||
Environment="ETCD_NAME=${etcd_name}"
|
||||
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
|
||||
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
|
||||
@ -50,29 +52,45 @@ systemd:
|
||||
Description=Kubelet via Hyperkube
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
EnvironmentFile=/etc/kubernetes/kubelet.env
|
||||
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/cache/kubelet-pod.uuid \
|
||||
--volume=resolv,kind=host,source=/etc/resolv.conf \
|
||||
--mount volume=resolv,target=/etc/resolv.conf \
|
||||
--volume var-lib-cni,kind=host,source=/var/lib/cni \
|
||||
--mount volume=var-lib-cni,target=/var/lib/cni \
|
||||
--volume var-lib-calico,kind=host,source=/var/lib/calico \
|
||||
--mount volume=var-lib-calico,target=/var/lib/calico \
|
||||
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
|
||||
--mount volume=opt-cni-bin,target=/opt/cni/bin \
|
||||
--volume var-log,kind=host,source=/var/log \
|
||||
--mount volume=var-log,target=/var/log \
|
||||
--hosts-entry=host \
|
||||
--insecure-options=image"
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/cni
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
|
||||
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
|
||||
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
|
||||
ExecStart=/usr/lib/coreos/kubelet-wrapper \
|
||||
ExecStart=/usr/bin/rkt run \
|
||||
--uuid-file-save=/var/cache/kubelet-pod.uuid \
|
||||
--stage1-from-dir=stage1-fly.aci \
|
||||
--hosts-entry host \
|
||||
--insecure-options=image \
|
||||
--volume etc-kubernetes,kind=host,source=/etc/kubernetes,readOnly=true \
|
||||
--mount volume=etc-kubernetes,target=/etc/kubernetes \
|
||||
--volume etc-machine-id,kind=host,source=/etc/machine-id,readOnly=true \
|
||||
--mount volume=etc-machine-id,target=/etc/machine-id \
|
||||
--volume etc-os-release,kind=host,source=/usr/lib/os-release,readOnly=true \
|
||||
--mount volume=etc-os-release,target=/etc/os-release \
|
||||
--volume=etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true \
|
||||
--mount volume=etc-resolv,target=/etc/resolv.conf \
|
||||
--volume etc-ssl-certs,kind=host,source=/etc/ssl/certs,readOnly=true \
|
||||
--mount volume=etc-ssl-certs,target=/etc/ssl/certs \
|
||||
--volume lib-modules,kind=host,source=/lib/modules,readOnly=true \
|
||||
--mount volume=lib-modules,target=/lib/modules \
|
||||
--volume run,kind=host,source=/run \
|
||||
--mount volume=run,target=/run \
|
||||
--volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true \
|
||||
--mount volume=usr-share-certs,target=/usr/share/ca-certificates \
|
||||
--volume var-lib-calico,kind=host,source=/var/lib/calico,readOnly=true \
|
||||
--mount volume=var-lib-calico,target=/var/lib/calico \
|
||||
--volume var-lib-docker,kind=host,source=/var/lib/docker \
|
||||
--mount volume=var-lib-docker,target=/var/lib/docker \
|
||||
--volume var-lib-kubelet,kind=host,source=/var/lib/kubelet,recursive=true \
|
||||
--mount volume=var-lib-kubelet,target=/var/lib/kubelet \
|
||||
--volume var-log,kind=host,source=/var/log \
|
||||
--mount volume=var-log,target=/var/log \
|
||||
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
|
||||
--mount volume=opt-cni-bin,target=/opt/cni/bin \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.0 -- \
|
||||
--anonymous-auth=false \
|
||||
--authentication-token-webhook \
|
||||
--authorization-mode=Webhook \
|
||||
@ -81,14 +99,15 @@ systemd:
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--exit-on-lock-contention \
|
||||
--healthz-port=0 \
|
||||
--kubeconfig=/etc/kubernetes/kubeconfig \
|
||||
--lock-file=/var/run/lock/kubelet.lock \
|
||||
--network-plugin=cni \
|
||||
--node-labels=node.kubernetes.io/master \
|
||||
--node-labels=node.kubernetes.io/controller="true" \
|
||||
--pod-manifest-path=/etc/kubernetes/manifests \
|
||||
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
|
||||
--read-only-port=0 \
|
||||
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
|
||||
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
|
||||
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
|
||||
Restart=always
|
||||
@ -104,7 +123,6 @@ systemd:
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
WorkingDirectory=/opt/bootstrap
|
||||
ExecStartPre=-/usr/bin/bash -c 'set -x && [ -n "$(ls /opt/bootstrap/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootstrap/assets/manifests-*/* /opt/bootstrap/assets/manifests && rm -rf /opt/bootstrap/assets/manifests-*'
|
||||
ExecStart=/usr/bin/rkt run \
|
||||
--trust-keys-from-https \
|
||||
--volume config,kind=host,source=/etc/kubernetes/bootstrap-secrets \
|
||||
@ -114,7 +132,7 @@ systemd:
|
||||
--volume script,kind=host,source=/opt/bootstrap/apply \
|
||||
--mount volume=script,target=/apply \
|
||||
--insecure-options=image \
|
||||
docker://k8s.gcr.io/hyperkube:v1.17.0 \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.0 \
|
||||
--net=host \
|
||||
--dns=host \
|
||||
--exec=/apply
|
||||
@ -129,14 +147,6 @@ storage:
|
||||
contents:
|
||||
inline: |
|
||||
${kubeconfig}
|
||||
- path: /etc/kubernetes/kubelet.env
|
||||
filesystem: root
|
||||
mode: 0644
|
||||
contents:
|
||||
inline: |
|
||||
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
|
||||
KUBELET_IMAGE_TAG=v1.17.0
|
||||
KUBELET_IMAGE_ARGS="--exec=/usr/local/bin/kubelet"
|
||||
- path: /opt/bootstrap/layout
|
||||
filesystem: root
|
||||
mode: 0544
|
||||
@ -157,7 +167,9 @@ storage:
|
||||
sudo mv static-manifests/* /etc/kubernetes/manifests/
|
||||
sudo mkdir -p /opt/bootstrap/assets
|
||||
sudo mv manifests /opt/bootstrap/assets/manifests
|
||||
sudo mv manifests-networking /opt/bootstrap/assets/manifests-networking
|
||||
sudo mkdir -p /opt/bootstrap/assets/manifests/crds
|
||||
sudo mv manifests-networking/crd*.yaml /opt/bootstrap/assets/manifests/crds
|
||||
sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/
|
||||
rm -rf assets auth static-manifests tls
|
||||
- path: /opt/bootstrap/apply
|
||||
filesystem: root
|
||||
@ -170,6 +182,10 @@ storage:
|
||||
echo "Waiting for static pod control plane"
|
||||
sleep 5
|
||||
done
|
||||
until kubectl apply -f /assets/manifests/crds -R; do
|
||||
echo "Retry Custom Resource Definition manifests"
|
||||
sleep 5
|
||||
done
|
||||
until kubectl apply -f /assets/manifests -R; do
|
||||
echo "Retry applying manifests"
|
||||
sleep 5
|
@ -11,16 +11,15 @@ resource "azurerm_dns_a_record" "etcds" {
|
||||
ttl = 300
|
||||
|
||||
# private IPv4 address for etcd
|
||||
records = [element(
|
||||
azurerm_network_interface.controllers.*.private_ip_address,
|
||||
count.index,
|
||||
)]
|
||||
records = [azurerm_network_interface.controllers.*.private_ip_address[count.index]]
|
||||
}
|
||||
|
||||
locals {
|
||||
# Channel for a Container Linux derivative
|
||||
# Container Linux derivative
|
||||
# coreos-stable -> Container Linux Stable
|
||||
channel = element(split("-", var.os_image), 1)
|
||||
# flatcar-stable -> Flatcar Linux Stable
|
||||
flavor = split("-", var.os_image)[0]
|
||||
channel = split("-", var.os_image)[1]
|
||||
}
|
||||
|
||||
# Controller availability set to spread controllers
|
||||
@ -35,92 +34,63 @@ resource "azurerm_availability_set" "controllers" {
|
||||
}
|
||||
|
||||
# Controller instances
|
||||
resource "azurerm_virtual_machine" "controllers" {
|
||||
resource "azurerm_linux_virtual_machine" "controllers" {
|
||||
count = var.controller_count
|
||||
resource_group_name = azurerm_resource_group.cluster.name
|
||||
|
||||
name = "${var.cluster_name}-controller-${count.index}"
|
||||
location = var.region
|
||||
availability_set_id = azurerm_availability_set.controllers.id
|
||||
vm_size = var.controller_type
|
||||
|
||||
# boot
|
||||
storage_image_reference {
|
||||
publisher = "CoreOS"
|
||||
offer = "CoreOS"
|
||||
size = var.controller_type
|
||||
custom_data = base64encode(data.ct_config.controller-ignitions.*.rendered[count.index])
|
||||
|
||||
# storage
|
||||
os_disk {
|
||||
name = "${var.cluster_name}-controller-${count.index}"
|
||||
caching = "None"
|
||||
disk_size_gb = var.disk_size
|
||||
storage_account_type = "Premium_LRS"
|
||||
}
|
||||
|
||||
source_image_reference {
|
||||
publisher = local.flavor == "flatcar" ? "Kinvolk" : "CoreOS"
|
||||
offer = local.flavor == "flatcar" ? "flatcar-container-linux" : "CoreOS"
|
||||
sku = local.channel
|
||||
version = "latest"
|
||||
}
|
||||
|
||||
# storage
|
||||
storage_os_disk {
|
||||
name = "${var.cluster_name}-controller-${count.index}"
|
||||
create_option = "FromImage"
|
||||
caching = "ReadWrite"
|
||||
disk_size_gb = var.disk_size
|
||||
os_type = "Linux"
|
||||
managed_disk_type = "Premium_LRS"
|
||||
}
|
||||
# Gross hack just for Flatcar Linux
|
||||
dynamic "plan" {
|
||||
for_each = local.flavor == "flatcar" ? [1] : []
|
||||
|
||||
# network
|
||||
network_interface_ids = [element(azurerm_network_interface.controllers.*.id, count.index)]
|
||||
|
||||
os_profile {
|
||||
computer_name = "${var.cluster_name}-controller-${count.index}"
|
||||
admin_username = "core"
|
||||
custom_data = element(data.ct_config.controller-ignitions.*.rendered, count.index)
|
||||
}
|
||||
|
||||
# Azure mandates setting an ssh_key, even though Ignition custom_data handles it too
|
||||
os_profile_linux_config {
|
||||
disable_password_authentication = true
|
||||
|
||||
ssh_keys {
|
||||
path = "/home/core/.ssh/authorized_keys"
|
||||
key_data = var.ssh_authorized_key
|
||||
content {
|
||||
name = local.channel
|
||||
publisher = "kinvolk"
|
||||
product = "flatcar-container-linux"
|
||||
}
|
||||
}
|
||||
|
||||
# lifecycle
|
||||
delete_os_disk_on_termination = true
|
||||
delete_data_disks_on_termination = true
|
||||
# network
|
||||
network_interface_ids = [
|
||||
azurerm_network_interface.controllers.*.id[count.index]
|
||||
]
|
||||
|
||||
# Azure requires setting admin_ssh_key, though Ignition custom_data handles it too
|
||||
admin_username = "core"
|
||||
admin_ssh_key {
|
||||
username = "core"
|
||||
public_key = var.ssh_authorized_key
|
||||
}
|
||||
|
||||
lifecycle {
|
||||
ignore_changes = [
|
||||
storage_os_disk,
|
||||
os_profile,
|
||||
os_disk,
|
||||
custom_data,
|
||||
]
|
||||
}
|
||||
}
|
||||
|
||||
# Controller NICs with public and private IPv4
|
||||
resource "azurerm_network_interface" "controllers" {
|
||||
count = var.controller_count
|
||||
resource_group_name = azurerm_resource_group.cluster.name
|
||||
|
||||
name = "${var.cluster_name}-controller-${count.index}"
|
||||
location = azurerm_resource_group.cluster.location
|
||||
network_security_group_id = azurerm_network_security_group.controller.id
|
||||
|
||||
ip_configuration {
|
||||
name = "ip0"
|
||||
subnet_id = azurerm_subnet.controller.id
|
||||
private_ip_address_allocation = "dynamic"
|
||||
|
||||
# public IPv4
|
||||
public_ip_address_id = element(azurerm_public_ip.controllers.*.id, count.index)
|
||||
}
|
||||
}
|
||||
|
||||
# Add controller NICs to the controller backend address pool
|
||||
resource "azurerm_network_interface_backend_address_pool_association" "controllers" {
|
||||
count = var.controller_count
|
||||
|
||||
network_interface_id = azurerm_network_interface.controllers[count.index].id
|
||||
ip_configuration_name = "ip0"
|
||||
backend_address_pool_id = azurerm_lb_backend_address_pool.controller.id
|
||||
}
|
||||
|
||||
# Controller public IPv4 addresses
|
||||
resource "azurerm_public_ip" "controllers" {
|
||||
count = var.controller_count
|
||||
@ -132,13 +102,44 @@ resource "azurerm_public_ip" "controllers" {
|
||||
allocation_method = "Static"
|
||||
}
|
||||
|
||||
# Controller NICs with public and private IPv4
|
||||
resource "azurerm_network_interface" "controllers" {
|
||||
count = var.controller_count
|
||||
resource_group_name = azurerm_resource_group.cluster.name
|
||||
|
||||
name = "${var.cluster_name}-controller-${count.index}"
|
||||
location = azurerm_resource_group.cluster.location
|
||||
|
||||
ip_configuration {
|
||||
name = "ip0"
|
||||
subnet_id = azurerm_subnet.controller.id
|
||||
private_ip_address_allocation = "Dynamic"
|
||||
# instance public IPv4
|
||||
public_ip_address_id = azurerm_public_ip.controllers.*.id[count.index]
|
||||
}
|
||||
}
|
||||
|
||||
# Associate controller network interface with controller security group
|
||||
resource "azurerm_network_interface_security_group_association" "controllers" {
|
||||
count = var.controller_count
|
||||
|
||||
network_interface_id = azurerm_network_interface.controllers[count.index].id
|
||||
network_security_group_id = azurerm_network_security_group.controller.id
|
||||
}
|
||||
|
||||
# Associate controller network interface with controller backend address pool
|
||||
resource "azurerm_network_interface_backend_address_pool_association" "controllers" {
|
||||
count = var.controller_count
|
||||
|
||||
network_interface_id = azurerm_network_interface.controllers[count.index].id
|
||||
ip_configuration_name = "ip0"
|
||||
backend_address_pool_id = azurerm_lb_backend_address_pool.controller.id
|
||||
}
|
||||
|
||||
# Controller Ignition configs
|
||||
data "ct_config" "controller-ignitions" {
|
||||
count = var.controller_count
|
||||
content = element(
|
||||
data.template_file.controller-configs.*.rendered,
|
||||
count.index,
|
||||
)
|
||||
count = var.controller_count
|
||||
content = data.template_file.controller-configs.*.rendered[count.index]
|
||||
pretty_print = false
|
||||
snippets = var.controller_clc_snippets
|
||||
}
|
||||
@ -147,7 +148,7 @@ data "ct_config" "controller-ignitions" {
|
||||
data "template_file" "controller-configs" {
|
||||
count = var.controller_count
|
||||
|
||||
template = file("${path.module}/cl/controller.yaml.tmpl")
|
||||
template = file("${path.module}/cl/controller.yaml")
|
||||
|
||||
vars = {
|
||||
# Cannot use cyclic dependencies on controllers or their DNS records
|
||||
|
@ -24,6 +24,11 @@ resource "azurerm_subnet" "controller" {
|
||||
address_prefix = cidrsubnet(var.host_cidr, 1, 0)
|
||||
}
|
||||
|
||||
resource "azurerm_subnet_network_security_group_association" "controller" {
|
||||
subnet_id = azurerm_subnet.controller.id
|
||||
network_security_group_id = azurerm_network_security_group.controller.id
|
||||
}
|
||||
|
||||
resource "azurerm_subnet" "worker" {
|
||||
resource_group_name = azurerm_resource_group.cluster.name
|
||||
|
||||
@ -32,3 +37,8 @@ resource "azurerm_subnet" "worker" {
|
||||
address_prefix = cidrsubnet(var.host_cidr, 1, 1)
|
||||
}
|
||||
|
||||
resource "azurerm_subnet_network_security_group_association" "worker" {
|
||||
subnet_id = azurerm_subnet.worker.id
|
||||
network_security_group_id = azurerm_network_security_group.worker.id
|
||||
}
|
||||
|
||||
|
@ -53,13 +53,29 @@ resource "azurerm_network_security_rule" "controller-etcd-metrics" {
|
||||
destination_address_prefix = azurerm_subnet.controller.address_prefix
|
||||
}
|
||||
|
||||
# Allow Prometheus to scrape kube-proxy metrics
|
||||
resource "azurerm_network_security_rule" "controller-kube-proxy" {
|
||||
resource_group_name = azurerm_resource_group.cluster.name
|
||||
|
||||
name = "allow-kube-proxy-metrics"
|
||||
network_security_group_name = azurerm_network_security_group.controller.name
|
||||
priority = "2011"
|
||||
access = "Allow"
|
||||
direction = "Inbound"
|
||||
protocol = "Tcp"
|
||||
source_port_range = "*"
|
||||
destination_port_range = "10249"
|
||||
source_address_prefix = azurerm_subnet.worker.address_prefix
|
||||
destination_address_prefix = azurerm_subnet.controller.address_prefix
|
||||
}
|
||||
|
||||
# Allow Prometheus to scrape kube-scheduler and kube-controller-manager metrics
|
||||
resource "azurerm_network_security_rule" "controller-kube-metrics" {
|
||||
resource_group_name = azurerm_resource_group.cluster.name
|
||||
|
||||
name = "allow-kube-metrics"
|
||||
network_security_group_name = azurerm_network_security_group.controller.name
|
||||
priority = "2011"
|
||||
priority = "2012"
|
||||
access = "Allow"
|
||||
direction = "Inbound"
|
||||
protocol = "Tcp"
|
||||
@ -251,6 +267,22 @@ resource "azurerm_network_security_rule" "worker-node-exporter" {
|
||||
destination_address_prefix = azurerm_subnet.worker.address_prefix
|
||||
}
|
||||
|
||||
# Allow Prometheus to scrape kube-proxy
|
||||
resource "azurerm_network_security_rule" "worker-kube-proxy" {
|
||||
resource_group_name = azurerm_resource_group.cluster.name
|
||||
|
||||
name = "allow-kube-proxy"
|
||||
network_security_group_name = azurerm_network_security_group.worker.name
|
||||
priority = "2024"
|
||||
access = "Allow"
|
||||
direction = "Inbound"
|
||||
protocol = "Tcp"
|
||||
source_port_range = "*"
|
||||
destination_port_range = "10249"
|
||||
source_address_prefix = azurerm_subnet.worker.address_prefix
|
||||
destination_address_prefix = azurerm_subnet.worker.address_prefix
|
||||
}
|
||||
|
||||
# Allow apiserver to access kubelet's for exec, log, port-forward
|
||||
resource "azurerm_network_security_rule" "worker-kubelet" {
|
||||
resource_group_name = azurerm_resource_group.cluster.name
|
||||
|
@ -2,7 +2,7 @@ locals {
|
||||
# format assets for distribution
|
||||
assets_bundle = [
|
||||
# header with the unpack location
|
||||
for key, value in module.bootstrap.assets_dist:
|
||||
for key, value in module.bootstrap.assets_dist :
|
||||
format("##### %s\n%s", key, value)
|
||||
]
|
||||
}
|
||||
@ -13,7 +13,7 @@ resource "null_resource" "copy-controller-secrets" {
|
||||
|
||||
depends_on = [
|
||||
module.bootstrap,
|
||||
azurerm_virtual_machine.controllers
|
||||
azurerm_linux_virtual_machine.controllers
|
||||
]
|
||||
|
||||
connection {
|
||||
@ -22,7 +22,7 @@ resource "null_resource" "copy-controller-secrets" {
|
||||
user = "core"
|
||||
timeout = "15m"
|
||||
}
|
||||
|
||||
|
||||
provisioner "file" {
|
||||
content = join("\n", local.assets_bundle)
|
||||
destination = "$HOME/assets"
|
||||
@ -45,7 +45,7 @@ resource "null_resource" "bootstrap" {
|
||||
|
||||
connection {
|
||||
type = "ssh"
|
||||
host = element(azurerm_public_ip.controllers.*.ip_address, 0)
|
||||
host = azurerm_public_ip.controllers.*.ip_address[0]
|
||||
user = "core"
|
||||
timeout = "15m"
|
||||
}
|
||||
|
@ -49,7 +49,7 @@ variable "worker_type" {
|
||||
variable "os_image" {
|
||||
type = string
|
||||
default = "coreos-stable"
|
||||
description = "Channel for a Container Linux derivative (coreos-stable, coreos-beta, coreos-alpha)"
|
||||
description = "Channel for a Container Linux derivative (coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta)"
|
||||
}
|
||||
|
||||
variable "disk_size" {
|
||||
|
@ -3,7 +3,7 @@
|
||||
terraform {
|
||||
required_version = "~> 0.12.6"
|
||||
required_providers {
|
||||
azurerm = "~> 1.27"
|
||||
azurerm = "~> 2.0"
|
||||
ct = "~> 0.3"
|
||||
template = "~> 2.1"
|
||||
null = "~> 2.1"
|
||||
|
@ -22,4 +22,3 @@ module "workers" {
|
||||
clc_snippets = var.worker_clc_snippets
|
||||
node_labels = var.worker_node_labels
|
||||
}
|
||||
|
||||
|
@ -25,28 +25,45 @@ systemd:
|
||||
Description=Kubelet via Hyperkube
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
EnvironmentFile=/etc/kubernetes/kubelet.env
|
||||
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/cache/kubelet-pod.uuid \
|
||||
--volume=resolv,kind=host,source=/etc/resolv.conf \
|
||||
--mount volume=resolv,target=/etc/resolv.conf \
|
||||
--volume var-lib-cni,kind=host,source=/var/lib/cni \
|
||||
--mount volume=var-lib-cni,target=/var/lib/cni \
|
||||
--volume var-lib-calico,kind=host,source=/var/lib/calico \
|
||||
--mount volume=var-lib-calico,target=/var/lib/calico \
|
||||
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
|
||||
--mount volume=opt-cni-bin,target=/opt/cni/bin \
|
||||
--volume var-log,kind=host,source=/var/log \
|
||||
--mount volume=var-log,target=/var/log \
|
||||
--insecure-options=image"
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/cni
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
|
||||
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
|
||||
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
|
||||
ExecStart=/usr/lib/coreos/kubelet-wrapper \
|
||||
ExecStart=/usr/bin/rkt run \
|
||||
--uuid-file-save=/var/cache/kubelet-pod.uuid \
|
||||
--stage1-from-dir=stage1-fly.aci \
|
||||
--hosts-entry host \
|
||||
--insecure-options=image \
|
||||
--volume etc-kubernetes,kind=host,source=/etc/kubernetes,readOnly=true \
|
||||
--mount volume=etc-kubernetes,target=/etc/kubernetes \
|
||||
--volume etc-machine-id,kind=host,source=/etc/machine-id,readOnly=true \
|
||||
--mount volume=etc-machine-id,target=/etc/machine-id \
|
||||
--volume etc-os-release,kind=host,source=/usr/lib/os-release,readOnly=true \
|
||||
--mount volume=etc-os-release,target=/etc/os-release \
|
||||
--volume=etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true \
|
||||
--mount volume=etc-resolv,target=/etc/resolv.conf \
|
||||
--volume etc-ssl-certs,kind=host,source=/etc/ssl/certs,readOnly=true \
|
||||
--mount volume=etc-ssl-certs,target=/etc/ssl/certs \
|
||||
--volume lib-modules,kind=host,source=/lib/modules,readOnly=true \
|
||||
--mount volume=lib-modules,target=/lib/modules \
|
||||
--volume run,kind=host,source=/run \
|
||||
--mount volume=run,target=/run \
|
||||
--volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true \
|
||||
--mount volume=usr-share-certs,target=/usr/share/ca-certificates \
|
||||
--volume var-lib-calico,kind=host,source=/var/lib/calico,readOnly=true \
|
||||
--mount volume=var-lib-calico,target=/var/lib/calico \
|
||||
--volume var-lib-docker,kind=host,source=/var/lib/docker \
|
||||
--mount volume=var-lib-docker,target=/var/lib/docker \
|
||||
--volume var-lib-kubelet,kind=host,source=/var/lib/kubelet,recursive=true \
|
||||
--mount volume=var-lib-kubelet,target=/var/lib/kubelet \
|
||||
--volume var-log,kind=host,source=/var/log \
|
||||
--mount volume=var-log,target=/var/log \
|
||||
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
|
||||
--mount volume=opt-cni-bin,target=/opt/cni/bin \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.0 -- \
|
||||
--anonymous-auth=false \
|
||||
--authentication-token-webhook \
|
||||
--authorization-mode=Webhook \
|
||||
@ -55,13 +72,14 @@ systemd:
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--exit-on-lock-contention \
|
||||
--healthz-port=0 \
|
||||
--kubeconfig=/etc/kubernetes/kubeconfig \
|
||||
--lock-file=/var/run/lock/kubelet.lock \
|
||||
--network-plugin=cni \
|
||||
--node-labels=node.kubernetes.io/node \
|
||||
%{ for label in split(",", node_labels) }
|
||||
%{~ for label in split(",", node_labels) ~}
|
||||
--node-labels=${label} \
|
||||
%{ endfor ~}
|
||||
%{~ endfor ~}
|
||||
--pod-manifest-path=/etc/kubernetes/manifests \
|
||||
--read-only-port=0 \
|
||||
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
|
||||
@ -90,14 +108,6 @@ storage:
|
||||
contents:
|
||||
inline: |
|
||||
${kubeconfig}
|
||||
- path: /etc/kubernetes/kubelet.env
|
||||
filesystem: root
|
||||
mode: 0644
|
||||
contents:
|
||||
inline: |
|
||||
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
|
||||
KUBELET_IMAGE_TAG=v1.17.0
|
||||
KUBELET_IMAGE_ARGS="--exec=/usr/local/bin/kubelet"
|
||||
- path: /etc/sysctl.d/max-user-watches.conf
|
||||
filesystem: root
|
||||
contents:
|
||||
@ -115,11 +125,10 @@ storage:
|
||||
--volume config,kind=host,source=/etc/kubernetes \
|
||||
--mount volume=config,target=/etc/kubernetes \
|
||||
--insecure-options=image \
|
||||
docker://k8s.gcr.io/hyperkube:v1.17.0 \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.0 \
|
||||
--net=host \
|
||||
--dns=host \
|
||||
-- \
|
||||
kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname | tr '[:upper:]' '[:lower:]')
|
||||
--exec=/usr/local/bin/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname | tr '[:upper:]' '[:lower:]')
|
||||
passwd:
|
||||
users:
|
||||
- name: core
|
@ -1,57 +1,56 @@
|
||||
locals {
|
||||
# Channel for a Container Linux derivative
|
||||
# coreos-stable -> Container Linux Stable
|
||||
channel = element(split("-", var.os_image), 1)
|
||||
# flatcar-stable -> Flatcar Linux Stable
|
||||
flavor = split("-", var.os_image)[0]
|
||||
channel = split("-", var.os_image)[1]
|
||||
}
|
||||
|
||||
# Workers scale set
|
||||
resource "azurerm_virtual_machine_scale_set" "workers" {
|
||||
resource "azurerm_linux_virtual_machine_scale_set" "workers" {
|
||||
resource_group_name = var.resource_group_name
|
||||
|
||||
name = "${var.name}-workers"
|
||||
name = "${var.name}-worker"
|
||||
location = var.region
|
||||
sku = var.vm_type
|
||||
instances = var.worker_count
|
||||
# instance name prefix for instances in the set
|
||||
computer_name_prefix = "${var.name}-worker"
|
||||
single_placement_group = false
|
||||
custom_data = base64encode(data.ct_config.worker-ignition.rendered)
|
||||
|
||||
sku {
|
||||
name = var.vm_type
|
||||
tier = "standard"
|
||||
capacity = var.worker_count
|
||||
# storage
|
||||
os_disk {
|
||||
storage_account_type = "Standard_LRS"
|
||||
caching = "ReadWrite"
|
||||
}
|
||||
|
||||
# boot
|
||||
storage_profile_image_reference {
|
||||
publisher = "CoreOS"
|
||||
offer = "CoreOS"
|
||||
source_image_reference {
|
||||
publisher = local.flavor == "flatcar" ? "Kinvolk" : "CoreOS"
|
||||
offer = local.flavor == "flatcar" ? "flatcar-container-linux" : "CoreOS"
|
||||
sku = local.channel
|
||||
version = "latest"
|
||||
}
|
||||
|
||||
# storage
|
||||
storage_profile_os_disk {
|
||||
create_option = "FromImage"
|
||||
caching = "ReadWrite"
|
||||
os_type = "linux"
|
||||
managed_disk_type = "Standard_LRS"
|
||||
}
|
||||
# Gross hack just for Flatcar Linux
|
||||
dynamic "plan" {
|
||||
for_each = local.flavor == "flatcar" ? [1] : []
|
||||
|
||||
os_profile {
|
||||
computer_name_prefix = "${var.name}-worker-"
|
||||
admin_username = "core"
|
||||
custom_data = data.ct_config.worker-ignition.rendered
|
||||
}
|
||||
|
||||
# Azure mandates setting an ssh_key, even though Ignition custom_data handles it too
|
||||
os_profile_linux_config {
|
||||
disable_password_authentication = true
|
||||
|
||||
ssh_keys {
|
||||
path = "/home/core/.ssh/authorized_keys"
|
||||
key_data = var.ssh_authorized_key
|
||||
content {
|
||||
name = local.channel
|
||||
publisher = "kinvolk"
|
||||
product = "flatcar-container-linux"
|
||||
}
|
||||
}
|
||||
|
||||
# Azure requires setting admin_ssh_key, though Ignition custom_data handles it too
|
||||
admin_username = "core"
|
||||
admin_ssh_key {
|
||||
username = "core"
|
||||
public_key = var.ssh_authorized_key
|
||||
}
|
||||
|
||||
# network
|
||||
network_profile {
|
||||
network_interface {
|
||||
name = "nic0"
|
||||
primary = true
|
||||
network_security_group_id = var.security_group_id
|
||||
@ -67,10 +66,10 @@ resource "azurerm_virtual_machine_scale_set" "workers" {
|
||||
}
|
||||
|
||||
# lifecycle
|
||||
upgrade_policy_mode = "Manual"
|
||||
# eviction policy may only be set when priority is Low
|
||||
upgrade_mode = "Manual"
|
||||
# eviction policy may only be set when priority is Spot
|
||||
priority = var.priority
|
||||
eviction_policy = var.priority == "Low" ? "Delete" : null
|
||||
eviction_policy = var.priority == "Spot" ? "Delete" : null
|
||||
}
|
||||
|
||||
# Scale up or down to maintain desired number, tolerating deallocations.
|
||||
@ -82,7 +81,7 @@ resource "azurerm_monitor_autoscale_setting" "workers" {
|
||||
|
||||
# autoscale
|
||||
enabled = true
|
||||
target_resource_id = azurerm_virtual_machine_scale_set.workers.id
|
||||
target_resource_id = azurerm_linux_virtual_machine_scale_set.workers.id
|
||||
|
||||
profile {
|
||||
name = "default"
|
||||
@ -104,7 +103,7 @@ data "ct_config" "worker-ignition" {
|
||||
|
||||
# Worker Container Linux configs
|
||||
data "template_file" "worker-config" {
|
||||
template = file("${path.module}/cl/worker.yaml.tmpl")
|
||||
template = file("${path.module}/cl/worker.yaml")
|
||||
|
||||
vars = {
|
||||
kubeconfig = indent(10, var.kubeconfig)
|
||||
|
@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.17.0 (upstream)
|
||||
* Kubernetes v1.18.0 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=24e5513ee6da4888005aac02101f73fc9e5e829f"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=cb170f802d09dcbe88b050257bc676e25d3c4282"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [var.k8s_domain_name]
|
||||
|
@ -7,7 +7,9 @@ systemd:
|
||||
- name: 40-etcd-cluster.conf
|
||||
contents: |
|
||||
[Service]
|
||||
Environment="ETCD_IMAGE_TAG=v3.4.3"
|
||||
Environment="ETCD_IMAGE_TAG=v3.4.5"
|
||||
Environment="ETCD_IMAGE_URL=docker://quay.io/coreos/etcd"
|
||||
Environment="RKT_RUN_ARGS=--insecure-options=image"
|
||||
Environment="ETCD_NAME=${etcd_name}"
|
||||
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${domain_name}:2379"
|
||||
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${domain_name}:2380"
|
||||
@ -58,33 +60,50 @@ systemd:
|
||||
Description=Kubelet via Hyperkube
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
EnvironmentFile=/etc/kubernetes/kubelet.env
|
||||
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/cache/kubelet-pod.uuid \
|
||||
--volume=resolv,kind=host,source=/etc/resolv.conf \
|
||||
--mount volume=resolv,target=/etc/resolv.conf \
|
||||
--volume var-lib-cni,kind=host,source=/var/lib/cni \
|
||||
--mount volume=var-lib-cni,target=/var/lib/cni \
|
||||
--volume var-lib-calico,kind=host,source=/var/lib/calico \
|
||||
--mount volume=var-lib-calico,target=/var/lib/calico \
|
||||
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
|
||||
--mount volume=opt-cni-bin,target=/opt/cni/bin \
|
||||
--volume var-log,kind=host,source=/var/log \
|
||||
--mount volume=var-log,target=/var/log \
|
||||
--volume iscsiconf,kind=host,source=/etc/iscsi/ \
|
||||
--mount volume=iscsiconf,target=/etc/iscsi/ \
|
||||
--volume iscsiadm,kind=host,source=/usr/sbin/iscsiadm \
|
||||
--mount volume=iscsiadm,target=/sbin/iscsiadm \
|
||||
--insecure-options=image"
|
||||
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/cni
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
|
||||
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
|
||||
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
|
||||
ExecStart=/usr/lib/coreos/kubelet-wrapper \
|
||||
ExecStart=/usr/bin/rkt run \
|
||||
--uuid-file-save=/var/cache/kubelet-pod.uuid \
|
||||
--stage1-from-dir=stage1-fly.aci \
|
||||
--hosts-entry host \
|
||||
--insecure-options=image \
|
||||
--volume etc-kubernetes,kind=host,source=/etc/kubernetes,readOnly=true \
|
||||
--mount volume=etc-kubernetes,target=/etc/kubernetes \
|
||||
--volume etc-machine-id,kind=host,source=/etc/machine-id,readOnly=true \
|
||||
--mount volume=etc-machine-id,target=/etc/machine-id \
|
||||
--volume etc-os-release,kind=host,source=/usr/lib/os-release,readOnly=true \
|
||||
--mount volume=etc-os-release,target=/etc/os-release \
|
||||
--volume=etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true \
|
||||
--mount volume=etc-resolv,target=/etc/resolv.conf \
|
||||
--volume etc-ssl-certs,kind=host,source=/etc/ssl/certs,readOnly=true \
|
||||
--mount volume=etc-ssl-certs,target=/etc/ssl/certs \
|
||||
--volume lib-modules,kind=host,source=/lib/modules,readOnly=true \
|
||||
--mount volume=lib-modules,target=/lib/modules \
|
||||
--volume run,kind=host,source=/run \
|
||||
--mount volume=run,target=/run \
|
||||
--volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true \
|
||||
--mount volume=usr-share-certs,target=/usr/share/ca-certificates \
|
||||
--volume var-lib-calico,kind=host,source=/var/lib/calico,readOnly=true \
|
||||
--mount volume=var-lib-calico,target=/var/lib/calico \
|
||||
--volume var-lib-docker,kind=host,source=/var/lib/docker \
|
||||
--mount volume=var-lib-docker,target=/var/lib/docker \
|
||||
--volume var-lib-kubelet,kind=host,source=/var/lib/kubelet,recursive=true \
|
||||
--mount volume=var-lib-kubelet,target=/var/lib/kubelet \
|
||||
--volume var-log,kind=host,source=/var/log \
|
||||
--mount volume=var-log,target=/var/log \
|
||||
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
|
||||
--mount volume=opt-cni-bin,target=/opt/cni/bin \
|
||||
--volume etc-iscsi,kind=host,source=/etc/iscsi \
|
||||
--mount volume=etc-iscsi,target=/etc/iscsi \
|
||||
--volume usr-sbin-iscsiadm,kind=host,source=/usr/sbin/iscsiadm \
|
||||
--mount volume=usr-sbin-iscsiadm,target=/sbin/iscsiadm \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.0 -- \
|
||||
--anonymous-auth=false \
|
||||
--authentication-token-webhook \
|
||||
--authorization-mode=Webhook \
|
||||
@ -94,6 +113,7 @@ systemd:
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--exit-on-lock-contention \
|
||||
--healthz-port=0 \
|
||||
--hostname-override=${domain_name} \
|
||||
--kubeconfig=/etc/kubernetes/kubeconfig \
|
||||
--lock-file=/var/run/lock/kubelet.lock \
|
||||
@ -118,7 +138,6 @@ systemd:
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
WorkingDirectory=/opt/bootstrap
|
||||
ExecStartPre=-/usr/bin/bash -c 'set -x && [ -n "$(ls /opt/bootstrap/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootstrap/assets/manifests-*/* /opt/bootstrap/assets/manifests && rm -rf /opt/bootstrap/assets/manifests-*'
|
||||
ExecStart=/usr/bin/rkt run \
|
||||
--trust-keys-from-https \
|
||||
--volume config,kind=host,source=/etc/kubernetes/bootstrap-secrets \
|
||||
@ -128,7 +147,7 @@ systemd:
|
||||
--volume script,kind=host,source=/opt/bootstrap/apply \
|
||||
--mount volume=script,target=/apply \
|
||||
--insecure-options=image \
|
||||
docker://k8s.gcr.io/hyperkube:v1.17.0 \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.0 \
|
||||
--net=host \
|
||||
--dns=host \
|
||||
--exec=/apply
|
||||
@ -136,15 +155,10 @@ systemd:
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
storage:
|
||||
files:
|
||||
- path: /etc/kubernetes/kubelet.env
|
||||
directories:
|
||||
- path: /etc/kubernetes
|
||||
filesystem: root
|
||||
mode: 0644
|
||||
contents:
|
||||
inline: |
|
||||
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
|
||||
KUBELET_IMAGE_TAG=v1.17.0
|
||||
KUBELET_IMAGE_ARGS="--exec=/usr/local/bin/kubelet"
|
||||
files:
|
||||
- path: /etc/hostname
|
||||
filesystem: root
|
||||
mode: 0644
|
||||
@ -171,7 +185,9 @@ storage:
|
||||
sudo mv static-manifests/* /etc/kubernetes/manifests/
|
||||
sudo mkdir -p /opt/bootstrap/assets
|
||||
sudo mv manifests /opt/bootstrap/assets/manifests
|
||||
sudo mv manifests-networking /opt/bootstrap/assets/manifests-networking
|
||||
sudo mkdir -p /opt/bootstrap/assets/manifests/crds
|
||||
sudo mv manifests-networking/crd*.yaml /opt/bootstrap/assets/manifests/crds
|
||||
sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/
|
||||
rm -rf assets auth static-manifests tls
|
||||
- path: /opt/bootstrap/apply
|
||||
filesystem: root
|
||||
@ -184,6 +200,10 @@ storage:
|
||||
echo "Waiting for static pod control plane"
|
||||
sleep 5
|
||||
done
|
||||
until kubectl apply -f /assets/manifests/crds -R; do
|
||||
echo "Retry Custom Resource Definition manifests"
|
||||
sleep 5
|
||||
done
|
||||
until kubectl apply -f /assets/manifests -R; do
|
||||
echo "Retry applying manifests"
|
||||
sleep 5
|
@ -33,33 +33,50 @@ systemd:
|
||||
Description=Kubelet via Hyperkube
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
EnvironmentFile=/etc/kubernetes/kubelet.env
|
||||
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/cache/kubelet-pod.uuid \
|
||||
--volume=resolv,kind=host,source=/etc/resolv.conf \
|
||||
--mount volume=resolv,target=/etc/resolv.conf \
|
||||
--volume var-lib-cni,kind=host,source=/var/lib/cni \
|
||||
--mount volume=var-lib-cni,target=/var/lib/cni \
|
||||
--volume var-lib-calico,kind=host,source=/var/lib/calico \
|
||||
--mount volume=var-lib-calico,target=/var/lib/calico \
|
||||
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
|
||||
--mount volume=opt-cni-bin,target=/opt/cni/bin \
|
||||
--volume var-log,kind=host,source=/var/log \
|
||||
--mount volume=var-log,target=/var/log \
|
||||
--volume iscsiconf,kind=host,source=/etc/iscsi/ \
|
||||
--mount volume=iscsiconf,target=/etc/iscsi/ \
|
||||
--volume iscsiadm,kind=host,source=/usr/sbin/iscsiadm \
|
||||
--mount volume=iscsiadm,target=/sbin/iscsiadm \
|
||||
--insecure-options=image"
|
||||
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/cni
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
|
||||
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
|
||||
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
|
||||
ExecStart=/usr/lib/coreos/kubelet-wrapper \
|
||||
ExecStart=/usr/bin/rkt run \
|
||||
--uuid-file-save=/var/cache/kubelet-pod.uuid \
|
||||
--stage1-from-dir=stage1-fly.aci \
|
||||
--hosts-entry host \
|
||||
--insecure-options=image \
|
||||
--volume etc-kubernetes,kind=host,source=/etc/kubernetes,readOnly=true \
|
||||
--mount volume=etc-kubernetes,target=/etc/kubernetes \
|
||||
--volume etc-machine-id,kind=host,source=/etc/machine-id,readOnly=true \
|
||||
--mount volume=etc-machine-id,target=/etc/machine-id \
|
||||
--volume etc-os-release,kind=host,source=/usr/lib/os-release,readOnly=true \
|
||||
--mount volume=etc-os-release,target=/etc/os-release \
|
||||
--volume=etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true \
|
||||
--mount volume=etc-resolv,target=/etc/resolv.conf \
|
||||
--volume etc-ssl-certs,kind=host,source=/etc/ssl/certs,readOnly=true \
|
||||
--mount volume=etc-ssl-certs,target=/etc/ssl/certs \
|
||||
--volume lib-modules,kind=host,source=/lib/modules,readOnly=true \
|
||||
--mount volume=lib-modules,target=/lib/modules \
|
||||
--volume run,kind=host,source=/run \
|
||||
--mount volume=run,target=/run \
|
||||
--volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true \
|
||||
--mount volume=usr-share-certs,target=/usr/share/ca-certificates \
|
||||
--volume var-lib-calico,kind=host,source=/var/lib/calico,readOnly=true \
|
||||
--mount volume=var-lib-calico,target=/var/lib/calico \
|
||||
--volume var-lib-docker,kind=host,source=/var/lib/docker \
|
||||
--mount volume=var-lib-docker,target=/var/lib/docker \
|
||||
--volume var-lib-kubelet,kind=host,source=/var/lib/kubelet,recursive=true \
|
||||
--mount volume=var-lib-kubelet,target=/var/lib/kubelet \
|
||||
--volume var-log,kind=host,source=/var/log \
|
||||
--mount volume=var-log,target=/var/log \
|
||||
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
|
||||
--mount volume=opt-cni-bin,target=/opt/cni/bin \
|
||||
--volume etc-iscsi,kind=host,source=/etc/iscsi \
|
||||
--mount volume=etc-iscsi,target=/etc/iscsi \
|
||||
--volume usr-sbin-iscsiadm,kind=host,source=/usr/sbin/iscsiadm \
|
||||
--mount volume=usr-sbin-iscsiadm,target=/sbin/iscsiadm \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.0 -- \
|
||||
--anonymous-auth=false \
|
||||
--authentication-token-webhook \
|
||||
--authorization-mode=Webhook \
|
||||
@ -69,11 +86,18 @@ systemd:
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--exit-on-lock-contention \
|
||||
--healthz-port=0 \
|
||||
--hostname-override=${domain_name} \
|
||||
--kubeconfig=/etc/kubernetes/kubeconfig \
|
||||
--lock-file=/var/run/lock/kubelet.lock \
|
||||
--network-plugin=cni \
|
||||
--node-labels=node.kubernetes.io/node \
|
||||
%{~ for label in compact(split(",", node_labels)) ~}
|
||||
--node-labels=${label} \
|
||||
%{~ endfor ~}
|
||||
%{~ for taint in compact(split(",", node_taints)) ~}
|
||||
--register-with-taints=${taint} \
|
||||
%{~ endfor ~}
|
||||
--pod-manifest-path=/etc/kubernetes/manifests \
|
||||
--read-only-port=0 \
|
||||
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
|
||||
@ -84,15 +108,10 @@ systemd:
|
||||
WantedBy=multi-user.target
|
||||
|
||||
storage:
|
||||
files:
|
||||
- path: /etc/kubernetes/kubelet.env
|
||||
directories:
|
||||
- path: /etc/kubernetes
|
||||
filesystem: root
|
||||
mode: 0644
|
||||
contents:
|
||||
inline: |
|
||||
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
|
||||
KUBELET_IMAGE_TAG=v1.17.0
|
||||
KUBELET_IMAGE_ARGS="--exec=/usr/local/bin/kubelet"
|
||||
files:
|
||||
- path: /etc/hostname
|
||||
filesystem: root
|
||||
mode: 0644
|
@ -31,7 +31,7 @@ resource "matchbox_profile" "container-linux-install" {
|
||||
data "template_file" "container-linux-install-configs" {
|
||||
count = length(var.controllers) + length(var.workers)
|
||||
|
||||
template = file("${path.module}/cl/install.yaml.tmpl")
|
||||
template = file("${path.module}/cl/install.yaml")
|
||||
|
||||
vars = {
|
||||
os_flavor = local.flavor
|
||||
@ -72,7 +72,7 @@ resource "matchbox_profile" "cached-container-linux-install" {
|
||||
data "template_file" "cached-container-linux-install-configs" {
|
||||
count = length(var.controllers) + length(var.workers)
|
||||
|
||||
template = file("${path.module}/cl/install.yaml.tmpl")
|
||||
template = file("${path.module}/cl/install.yaml")
|
||||
|
||||
vars = {
|
||||
os_flavor = local.flavor
|
||||
@ -150,7 +150,7 @@ data "ct_config" "controller-ignitions" {
|
||||
data "template_file" "controller-configs" {
|
||||
count = length(var.controllers)
|
||||
|
||||
template = file("${path.module}/cl/controller.yaml.tmpl")
|
||||
template = file("${path.module}/cl/controller.yaml")
|
||||
|
||||
vars = {
|
||||
domain_name = var.controllers.*.domain[count.index]
|
||||
@ -180,7 +180,7 @@ data "ct_config" "worker-ignitions" {
|
||||
data "template_file" "worker-configs" {
|
||||
count = length(var.workers)
|
||||
|
||||
template = file("${path.module}/cl/worker.yaml.tmpl")
|
||||
template = file("${path.module}/cl/worker.yaml")
|
||||
|
||||
vars = {
|
||||
domain_name = var.workers.*.domain[count.index]
|
||||
@ -188,6 +188,8 @@ data "template_file" "worker-configs" {
|
||||
cluster_dns_service_ip = module.bootstrap.cluster_dns_service_ip
|
||||
cluster_domain_suffix = var.cluster_domain_suffix
|
||||
ssh_authorized_key = var.ssh_authorized_key
|
||||
node_labels = join(",", lookup(var.worker_node_labels, var.workers.*.name[count.index], []))
|
||||
node_taints = join(",", lookup(var.worker_node_taints, var.workers.*.name[count.index], []))
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -2,7 +2,7 @@ locals {
|
||||
# format assets for distribution
|
||||
assets_bundle = [
|
||||
# header with the unpack location
|
||||
for key, value in module.bootstrap.assets_dist:
|
||||
for key, value in module.bootstrap.assets_dist :
|
||||
format("##### %s\n%s", key, value)
|
||||
]
|
||||
}
|
||||
|
@ -55,6 +55,18 @@ variable "clc_snippets" {
|
||||
default = {}
|
||||
}
|
||||
|
||||
variable "worker_node_labels" {
|
||||
type = map(list(string))
|
||||
description = "Map from worker names to lists of initial node labels"
|
||||
default = {}
|
||||
}
|
||||
|
||||
variable "worker_node_taints" {
|
||||
type = map(list(string))
|
||||
description = "Map from worker names to lists of initial node taints"
|
||||
default = {}
|
||||
}
|
||||
|
||||
# configuration
|
||||
|
||||
variable "k8s_domain_name" {
|
||||
|
@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.17.0 (upstream)
|
||||
* Kubernetes v1.18.0 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=24e5513ee6da4888005aac02101f73fc9e5e829f"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=cb170f802d09dcbe88b050257bc676e25d3c4282"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [var.k8s_domain_name]
|
||||
|
@ -28,7 +28,7 @@ systemd:
|
||||
--network host \
|
||||
--volume /var/lib/etcd:/var/lib/etcd:rw,Z \
|
||||
--volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \
|
||||
quay.io/coreos/etcd:v3.4.3
|
||||
quay.io/coreos/etcd:v3.4.5
|
||||
ExecStop=/usr/bin/podman stop etcd
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
@ -72,16 +72,15 @@ systemd:
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
|
||||
--volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
|
||||
--volume /etc/pki/tls/certs:/usr/share/ca-certificates:ro \
|
||||
--volume /var/lib/calico:/var/lib/calico \
|
||||
--volume /var/lib/calico:/var/lib/calico:ro \
|
||||
--volume /var/lib/docker:/var/lib/docker \
|
||||
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
|
||||
--volume /var/log:/var/log \
|
||||
--volume /var/run:/var/run \
|
||||
--volume /var/run/lock:/var/run/lock:z \
|
||||
--volume /opt/cni/bin:/opt/cni/bin:z \
|
||||
--volume /etc/iscsi:/etc/iscsi \
|
||||
--volume /sbin/iscsiadm:/sbin/iscsiadm \
|
||||
k8s.gcr.io/hyperkube:v1.17.0 kubelet \
|
||||
quay.io/poseidon/kubelet:v1.18.0 \
|
||||
--anonymous-auth=false \
|
||||
--authentication-token-webhook \
|
||||
--authorization-mode=Webhook \
|
||||
@ -93,6 +92,7 @@ systemd:
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--exit-on-lock-contention \
|
||||
--healthz-port=0 \
|
||||
--hostname-override=${domain_name} \
|
||||
--kubeconfig=/etc/kubernetes/kubeconfig \
|
||||
--lock-file=/var/run/lock/kubelet.lock \
|
||||
@ -127,14 +127,13 @@ systemd:
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
WorkingDirectory=/opt/bootstrap
|
||||
ExecStartPre=-/usr/bin/bash -c 'set -x && [ -n "$(ls /opt/bootstrap/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootstrap/assets/manifests-*/* /opt/bootstrap/assets/manifests && rm -rf /opt/bootstrap/assets/manifests-*'
|
||||
ExecStart=/usr/bin/podman run --name bootstrap \
|
||||
--network host \
|
||||
--volume /etc/kubernetes/bootstrap-secrets:/etc/kubernetes/secrets:ro,Z \
|
||||
--volume /opt/bootstrap/assets:/assets:ro,Z \
|
||||
--volume /opt/bootstrap/apply:/apply:ro,Z \
|
||||
--entrypoint=/apply \
|
||||
k8s.gcr.io/hyperkube:v1.17.0
|
||||
quay.io/poseidon/kubelet:v1.18.0
|
||||
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
|
||||
ExecStartPost=-/usr/bin/podman stop bootstrap
|
||||
storage:
|
||||
@ -166,7 +165,9 @@ storage:
|
||||
sudo mv static-manifests/* /etc/kubernetes/manifests/
|
||||
sudo mkdir -p /opt/bootstrap/assets
|
||||
sudo mv manifests /opt/bootstrap/assets/manifests
|
||||
sudo mv manifests-networking /opt/bootstrap/assets/manifests-networking
|
||||
sudo mkdir -p /opt/bootstrap/assets/manifests/crds
|
||||
sudo mv manifests-networking/crd*.yaml /opt/bootstrap/assets/manifests/crds
|
||||
sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/
|
||||
rm -rf assets auth static-manifests tls
|
||||
- path: /opt/bootstrap/apply
|
||||
mode: 0544
|
||||
@ -178,14 +179,14 @@ storage:
|
||||
echo "Waiting for static pod control plane"
|
||||
sleep 5
|
||||
done
|
||||
until kubectl apply -f /assets/manifests/crds -R; do
|
||||
echo "Retry Custom Resource Definition manifests"
|
||||
sleep 5
|
||||
done
|
||||
until kubectl apply -f /assets/manifests -R; do
|
||||
echo "Retry applying manifests"
|
||||
sleep 5
|
||||
done
|
||||
- path: /etc/sysctl.d/reverse-path-filter.conf
|
||||
contents:
|
||||
inline: |
|
||||
net.ipv4.conf.all.rp_filter=1
|
||||
- path: /etc/sysctl.d/max-user-watches.conf
|
||||
contents:
|
||||
inline: |
|
||||
|
@ -42,16 +42,15 @@ systemd:
|
||||
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
|
||||
--volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \
|
||||
--volume /etc/pki/tls/certs:/usr/share/ca-certificates:ro \
|
||||
--volume /var/lib/calico:/var/lib/calico \
|
||||
--volume /var/lib/calico:/var/lib/calico:ro \
|
||||
--volume /var/lib/docker:/var/lib/docker \
|
||||
--volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \
|
||||
--volume /var/log:/var/log \
|
||||
--volume /var/run:/var/run \
|
||||
--volume /var/run/lock:/var/run/lock:z \
|
||||
--volume /opt/cni/bin:/opt/cni/bin:z \
|
||||
--volume /etc/iscsi:/etc/iscsi \
|
||||
--volume /sbin/iscsiadm:/sbin/iscsiadm \
|
||||
k8s.gcr.io/hyperkube:v1.17.0 kubelet \
|
||||
quay.io/poseidon/kubelet:v1.18.0 \
|
||||
--anonymous-auth=false \
|
||||
--authentication-token-webhook \
|
||||
--authorization-mode=Webhook \
|
||||
@ -63,11 +62,18 @@ systemd:
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--exit-on-lock-contention \
|
||||
--healthz-port=0 \
|
||||
--hostname-override=${domain_name} \
|
||||
--kubeconfig=/etc/kubernetes/kubeconfig \
|
||||
--lock-file=/var/run/lock/kubelet.lock \
|
||||
--network-plugin=cni \
|
||||
--node-labels=node.kubernetes.io/node \
|
||||
%{~ for label in compact(split(",", node_labels)) ~}
|
||||
--node-labels=${label} \
|
||||
%{~ endfor ~}
|
||||
%{~ for taint in compact(split(",", node_taints)) ~}
|
||||
--register-with-taints=${taint} \
|
||||
%{~ endfor ~}
|
||||
--pod-manifest-path=/etc/kubernetes/manifests \
|
||||
--read-only-port=0 \
|
||||
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
|
||||
@ -95,10 +101,6 @@ storage:
|
||||
contents:
|
||||
inline:
|
||||
${domain_name}
|
||||
- path: /etc/sysctl.d/reverse-path-filter.conf
|
||||
contents:
|
||||
inline: |
|
||||
net.ipv4.conf.all.rp_filter=1
|
||||
- path: /etc/sysctl.d/max-user-watches.conf
|
||||
contents:
|
||||
inline: |
|
||||
|
@ -1,24 +1,28 @@
|
||||
locals {
|
||||
remote_kernel = "https://builds.coreos.fedoraproject.org/prod/streams/${var.os_stream}/builds/${var.os_version}/x86_64/fedora-coreos-${var.os_version}-installer-kernel-x86_64"
|
||||
remote_initrd = "https://builds.coreos.fedoraproject.org/prod/streams/${var.os_stream}/builds/${var.os_version}/x86_64/fedora-coreos-${var.os_version}-installer-initramfs.x86_64.img"
|
||||
remote_kernel = "https://builds.coreos.fedoraproject.org/prod/streams/${var.os_stream}/builds/${var.os_version}/x86_64/fedora-coreos-${var.os_version}-live-kernel-x86_64"
|
||||
remote_initrd = "https://builds.coreos.fedoraproject.org/prod/streams/${var.os_stream}/builds/${var.os_version}/x86_64/fedora-coreos-${var.os_version}-live-initramfs.x86_64.img"
|
||||
remote_args = [
|
||||
"ip=dhcp",
|
||||
"rd.neednet=1",
|
||||
"coreos.inst=yes",
|
||||
"initrd=fedora-coreos-${var.os_version}-live-initramfs.x86_64.img",
|
||||
"coreos.inst.image_url=https://builds.coreos.fedoraproject.org/prod/streams/${var.os_stream}/builds/${var.os_version}/x86_64/fedora-coreos-${var.os_version}-metal.x86_64.raw.xz",
|
||||
"coreos.inst.ignition_url=${var.matchbox_http_endpoint}/ignition?uuid=$${uuid}&mac=$${mac:hexhyp}",
|
||||
"coreos.inst.install_dev=${var.install_disk}"
|
||||
"coreos.inst.install_dev=${var.install_disk}",
|
||||
"console=tty0",
|
||||
"console=ttyS0",
|
||||
]
|
||||
|
||||
cached_kernel = "/assets/fedora-coreos/fedora-coreos-${var.os_version}-installer-kernel-x86_64"
|
||||
cached_initrd = "/assets/fedora-coreos/fedora-coreos-${var.os_version}-installer-initramfs.x86_64.img"
|
||||
cached_kernel = "/assets/fedora-coreos/fedora-coreos-${var.os_version}-live-kernel-x86_64"
|
||||
cached_initrd = "/assets/fedora-coreos/fedora-coreos-${var.os_version}-live-initramfs.x86_64.img"
|
||||
cached_args = [
|
||||
"ip=dhcp",
|
||||
"rd.neednet=1",
|
||||
"coreos.inst=yes",
|
||||
"initrd=fedora-coreos-${var.os_version}-live-initramfs.x86_64.img",
|
||||
"coreos.inst.image_url=${var.matchbox_http_endpoint}/assets/fedora-coreos/fedora-coreos-${var.os_version}-metal.x86_64.raw.xz",
|
||||
"coreos.inst.ignition_url=${var.matchbox_http_endpoint}/ignition?uuid=$${uuid}&mac=$${mac:hexhyp}",
|
||||
"coreos.inst.install_dev=${var.install_disk}"
|
||||
"coreos.inst.install_dev=${var.install_disk}",
|
||||
"console=tty0",
|
||||
"console=ttyS0",
|
||||
]
|
||||
|
||||
kernel = var.cached_install ? local.cached_kernel : local.remote_kernel
|
||||
@ -46,6 +50,7 @@ data "ct_config" "controller-ignitions" {
|
||||
|
||||
content = data.template_file.controller-configs.*.rendered[count.index]
|
||||
strict = true
|
||||
snippets = lookup(var.snippets, var.controllers.*.name[count.index], [])
|
||||
}
|
||||
|
||||
data "template_file" "controller-configs" {
|
||||
@ -81,6 +86,7 @@ data "ct_config" "worker-ignitions" {
|
||||
|
||||
content = data.template_file.worker-configs.*.rendered[count.index]
|
||||
strict = true
|
||||
snippets = lookup(var.snippets, var.workers.*.name[count.index], [])
|
||||
}
|
||||
|
||||
data "template_file" "worker-configs" {
|
||||
@ -92,6 +98,8 @@ data "template_file" "worker-configs" {
|
||||
cluster_dns_service_ip = module.bootstrap.cluster_dns_service_ip
|
||||
cluster_domain_suffix = var.cluster_domain_suffix
|
||||
ssh_authorized_key = var.ssh_authorized_key
|
||||
node_labels = join(",", lookup(var.worker_node_labels, var.workers.*.name[count.index], []))
|
||||
node_taints = join(",", lookup(var.worker_node_taints, var.workers.*.name[count.index], []))
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -2,7 +2,7 @@ locals {
|
||||
# format assets for distribution
|
||||
assets_bundle = [
|
||||
# header with the unpack location
|
||||
for key, value in module.bootstrap.assets_dist:
|
||||
for key, value in module.bootstrap.assets_dist :
|
||||
format("##### %s\n%s", key, value)
|
||||
]
|
||||
}
|
||||
|
@ -13,12 +13,12 @@ variable "matchbox_http_endpoint" {
|
||||
variable "os_stream" {
|
||||
type = string
|
||||
description = "Fedora CoreOS release stream (e.g. testing, stable)"
|
||||
default = "testing"
|
||||
default = "stable"
|
||||
}
|
||||
|
||||
variable "os_version" {
|
||||
type = string
|
||||
description = "Fedora CoreOS version to PXE and install (e.g. 30.20190712.0)"
|
||||
description = "Fedora CoreOS version to PXE and install (e.g. 31.20200310.3.0)"
|
||||
}
|
||||
|
||||
# machines
|
||||
@ -56,6 +56,18 @@ variable "snippets" {
|
||||
default = {}
|
||||
}
|
||||
|
||||
variable "worker_node_labels" {
|
||||
type = map(list(string))
|
||||
description = "Map from worker names to lists of initial node labels"
|
||||
default = {}
|
||||
}
|
||||
|
||||
variable "worker_node_taints" {
|
||||
type = map(list(string))
|
||||
description = "Map from worker names to lists of initial node taints"
|
||||
default = {}
|
||||
}
|
||||
|
||||
# configuration
|
||||
|
||||
variable "k8s_domain_name" {
|
||||
|
@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
|
||||
|
||||
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
|
||||
|
||||
* Kubernetes v1.17.0 (upstream)
|
||||
* Kubernetes v1.18.0 (upstream)
|
||||
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
|
||||
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
|
||||
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Kubernetes assets (kubeconfig, manifests)
|
||||
module "bootstrap" {
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=24e5513ee6da4888005aac02101f73fc9e5e829f"
|
||||
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=cb170f802d09dcbe88b050257bc676e25d3c4282"
|
||||
|
||||
cluster_name = var.cluster_name
|
||||
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
|
||||
|
@ -7,7 +7,9 @@ systemd:
|
||||
- name: 40-etcd-cluster.conf
|
||||
contents: |
|
||||
[Service]
|
||||
Environment="ETCD_IMAGE_TAG=v3.4.3"
|
||||
Environment="ETCD_IMAGE_TAG=v3.4.5"
|
||||
Environment="ETCD_IMAGE_URL=docker://quay.io/coreos/etcd"
|
||||
Environment="RKT_RUN_ARGS=--insecure-options=image"
|
||||
Environment="ETCD_NAME=${etcd_name}"
|
||||
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
|
||||
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
|
||||
@ -60,29 +62,46 @@ systemd:
|
||||
After=coreos-metadata.service
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
EnvironmentFile=/etc/kubernetes/kubelet.env
|
||||
EnvironmentFile=/run/metadata/coreos
|
||||
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/cache/kubelet-pod.uuid \
|
||||
--volume=resolv,kind=host,source=/etc/resolv.conf \
|
||||
--mount volume=resolv,target=/etc/resolv.conf \
|
||||
--volume var-lib-cni,kind=host,source=/var/lib/cni \
|
||||
--mount volume=var-lib-cni,target=/var/lib/cni \
|
||||
--volume var-lib-calico,kind=host,source=/var/lib/calico \
|
||||
--mount volume=var-lib-calico,target=/var/lib/calico \
|
||||
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
|
||||
--mount volume=opt-cni-bin,target=/opt/cni/bin \
|
||||
--volume var-log,kind=host,source=/var/log \
|
||||
--mount volume=var-log,target=/var/log \
|
||||
--insecure-options=image"
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/cni
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
|
||||
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
|
||||
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
|
||||
ExecStart=/usr/lib/coreos/kubelet-wrapper \
|
||||
ExecStart=/usr/bin/rkt run \
|
||||
--uuid-file-save=/var/cache/kubelet-pod.uuid \
|
||||
--stage1-from-dir=stage1-fly.aci \
|
||||
--hosts-entry host \
|
||||
--insecure-options=image \
|
||||
--volume etc-kubernetes,kind=host,source=/etc/kubernetes,readOnly=true \
|
||||
--mount volume=etc-kubernetes,target=/etc/kubernetes \
|
||||
--volume etc-machine-id,kind=host,source=/etc/machine-id,readOnly=true \
|
||||
--mount volume=etc-machine-id,target=/etc/machine-id \
|
||||
--volume etc-os-release,kind=host,source=/usr/lib/os-release,readOnly=true \
|
||||
--mount volume=etc-os-release,target=/etc/os-release \
|
||||
--volume=etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true \
|
||||
--mount volume=etc-resolv,target=/etc/resolv.conf \
|
||||
--volume etc-ssl-certs,kind=host,source=/etc/ssl/certs,readOnly=true \
|
||||
--mount volume=etc-ssl-certs,target=/etc/ssl/certs \
|
||||
--volume lib-modules,kind=host,source=/lib/modules,readOnly=true \
|
||||
--mount volume=lib-modules,target=/lib/modules \
|
||||
--volume run,kind=host,source=/run \
|
||||
--mount volume=run,target=/run \
|
||||
--volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true \
|
||||
--mount volume=usr-share-certs,target=/usr/share/ca-certificates \
|
||||
--volume var-lib-calico,kind=host,source=/var/lib/calico,readOnly=true \
|
||||
--mount volume=var-lib-calico,target=/var/lib/calico \
|
||||
--volume var-lib-docker,kind=host,source=/var/lib/docker \
|
||||
--mount volume=var-lib-docker,target=/var/lib/docker \
|
||||
--volume var-lib-kubelet,kind=host,source=/var/lib/kubelet,recursive=true \
|
||||
--mount volume=var-lib-kubelet,target=/var/lib/kubelet \
|
||||
--volume var-log,kind=host,source=/var/log \
|
||||
--mount volume=var-log,target=/var/log \
|
||||
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
|
||||
--mount volume=opt-cni-bin,target=/opt/cni/bin \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.0 -- \
|
||||
--anonymous-auth=false \
|
||||
--authentication-token-webhook \
|
||||
--authorization-mode=Webhook \
|
||||
@ -91,6 +110,7 @@ systemd:
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--exit-on-lock-contention \
|
||||
--healthz-port=0 \
|
||||
--hostname-override=$${COREOS_DIGITALOCEAN_IPV4_PRIVATE_0} \
|
||||
--kubeconfig=/etc/kubernetes/kubeconfig \
|
||||
--lock-file=/var/run/lock/kubelet.lock \
|
||||
@ -115,7 +135,6 @@ systemd:
|
||||
Type=oneshot
|
||||
RemainAfterExit=true
|
||||
WorkingDirectory=/opt/bootstrap
|
||||
ExecStartPre=-/usr/bin/bash -c 'set -x && [ -n "$(ls /opt/bootstrap/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootstrap/assets/manifests-*/* /opt/bootstrap/assets/manifests && rm -rf /opt/bootstrap/assets/manifests-*'
|
||||
ExecStart=/usr/bin/rkt run \
|
||||
--trust-keys-from-https \
|
||||
--volume config,kind=host,source=/etc/kubernetes/bootstrap-secrets \
|
||||
@ -125,7 +144,7 @@ systemd:
|
||||
--volume script,kind=host,source=/opt/bootstrap/apply \
|
||||
--mount volume=script,target=/apply \
|
||||
--insecure-options=image \
|
||||
docker://k8s.gcr.io/hyperkube:v1.17.0 \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.0 \
|
||||
--net=host \
|
||||
--dns=host \
|
||||
--exec=/apply
|
||||
@ -133,15 +152,10 @@ systemd:
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
storage:
|
||||
files:
|
||||
- path: /etc/kubernetes/kubelet.env
|
||||
directories:
|
||||
- path: /etc/kubernetes
|
||||
filesystem: root
|
||||
mode: 0644
|
||||
contents:
|
||||
inline: |
|
||||
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
|
||||
KUBELET_IMAGE_TAG=v1.17.0
|
||||
KUBELET_IMAGE_ARGS="--exec=/usr/local/bin/kubelet"
|
||||
files:
|
||||
- path: /opt/bootstrap/layout
|
||||
filesystem: root
|
||||
mode: 0544
|
||||
@ -162,7 +176,9 @@ storage:
|
||||
sudo mv static-manifests/* /etc/kubernetes/manifests/
|
||||
sudo mkdir -p /opt/bootstrap/assets
|
||||
sudo mv manifests /opt/bootstrap/assets/manifests
|
||||
sudo mv manifests-networking /opt/bootstrap/assets/manifests-networking
|
||||
sudo mkdir -p /opt/bootstrap/assets/manifests/crds
|
||||
sudo mv manifests-networking/crd*.yaml /opt/bootstrap/assets/manifests/crds
|
||||
sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/
|
||||
rm -rf assets auth static-manifests tls
|
||||
- path: /opt/bootstrap/apply
|
||||
filesystem: root
|
||||
@ -175,6 +191,10 @@ storage:
|
||||
echo "Waiting for static pod control plane"
|
||||
sleep 5
|
||||
done
|
||||
until kubectl apply -f /assets/manifests/crds -R; do
|
||||
echo "Retry Custom Resource Definition manifests"
|
||||
sleep 5
|
||||
done
|
||||
until kubectl apply -f /assets/manifests -R; do
|
||||
echo "Retry applying manifests"
|
||||
sleep 5
|
@ -35,29 +35,46 @@ systemd:
|
||||
After=coreos-metadata.service
|
||||
Wants=rpc-statd.service
|
||||
[Service]
|
||||
EnvironmentFile=/etc/kubernetes/kubelet.env
|
||||
EnvironmentFile=/run/metadata/coreos
|
||||
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/cache/kubelet-pod.uuid \
|
||||
--volume=resolv,kind=host,source=/etc/resolv.conf \
|
||||
--mount volume=resolv,target=/etc/resolv.conf \
|
||||
--volume var-lib-cni,kind=host,source=/var/lib/cni \
|
||||
--mount volume=var-lib-cni,target=/var/lib/cni \
|
||||
--volume var-lib-calico,kind=host,source=/var/lib/calico \
|
||||
--mount volume=var-lib-calico,target=/var/lib/calico \
|
||||
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
|
||||
--mount volume=opt-cni-bin,target=/opt/cni/bin \
|
||||
--volume var-log,kind=host,source=/var/log \
|
||||
--mount volume=var-log,target=/var/log \
|
||||
--insecure-options=image"
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
|
||||
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
|
||||
ExecStartPre=/bin/mkdir -p /opt/cni/bin
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/cni
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/calico
|
||||
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
|
||||
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
|
||||
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
|
||||
ExecStart=/usr/lib/coreos/kubelet-wrapper \
|
||||
ExecStart=/usr/bin/rkt run \
|
||||
--uuid-file-save=/var/cache/kubelet-pod.uuid \
|
||||
--stage1-from-dir=stage1-fly.aci \
|
||||
--hosts-entry host \
|
||||
--insecure-options=image \
|
||||
--volume etc-kubernetes,kind=host,source=/etc/kubernetes,readOnly=true \
|
||||
--mount volume=etc-kubernetes,target=/etc/kubernetes \
|
||||
--volume etc-machine-id,kind=host,source=/etc/machine-id,readOnly=true \
|
||||
--mount volume=etc-machine-id,target=/etc/machine-id \
|
||||
--volume etc-os-release,kind=host,source=/usr/lib/os-release,readOnly=true \
|
||||
--mount volume=etc-os-release,target=/etc/os-release \
|
||||
--volume=etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true \
|
||||
--mount volume=etc-resolv,target=/etc/resolv.conf \
|
||||
--volume etc-ssl-certs,kind=host,source=/etc/ssl/certs,readOnly=true \
|
||||
--mount volume=etc-ssl-certs,target=/etc/ssl/certs \
|
||||
--volume lib-modules,kind=host,source=/lib/modules,readOnly=true \
|
||||
--mount volume=lib-modules,target=/lib/modules \
|
||||
--volume run,kind=host,source=/run \
|
||||
--mount volume=run,target=/run \
|
||||
--volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true \
|
||||
--mount volume=usr-share-certs,target=/usr/share/ca-certificates \
|
||||
--volume var-lib-calico,kind=host,source=/var/lib/calico,readOnly=true \
|
||||
--mount volume=var-lib-calico,target=/var/lib/calico \
|
||||
--volume var-lib-docker,kind=host,source=/var/lib/docker \
|
||||
--mount volume=var-lib-docker,target=/var/lib/docker \
|
||||
--volume var-lib-kubelet,kind=host,source=/var/lib/kubelet,recursive=true \
|
||||
--mount volume=var-lib-kubelet,target=/var/lib/kubelet \
|
||||
--volume var-log,kind=host,source=/var/log \
|
||||
--mount volume=var-log,target=/var/log \
|
||||
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
|
||||
--mount volume=opt-cni-bin,target=/opt/cni/bin \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.0 -- \
|
||||
--anonymous-auth=false \
|
||||
--authentication-token-webhook \
|
||||
--authorization-mode=Webhook \
|
||||
@ -66,6 +83,7 @@ systemd:
|
||||
--cluster_domain=${cluster_domain_suffix} \
|
||||
--cni-conf-dir=/etc/kubernetes/cni/net.d \
|
||||
--exit-on-lock-contention \
|
||||
--healthz-port=0 \
|
||||
--hostname-override=$${COREOS_DIGITALOCEAN_IPV4_PRIVATE_0} \
|
||||
--kubeconfig=/etc/kubernetes/kubeconfig \
|
||||
--lock-file=/var/run/lock/kubelet.lock \
|
||||
@ -92,15 +110,10 @@ systemd:
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
storage:
|
||||
files:
|
||||
- path: /etc/kubernetes/kubelet.env
|
||||
directories:
|
||||
- path: /etc/kubernetes
|
||||
filesystem: root
|
||||
mode: 0644
|
||||
contents:
|
||||
inline: |
|
||||
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
|
||||
KUBELET_IMAGE_TAG=v1.17.0
|
||||
KUBELET_IMAGE_ARGS="--exec=/usr/local/bin/kubelet"
|
||||
files:
|
||||
- path: /etc/sysctl.d/max-user-watches.conf
|
||||
filesystem: root
|
||||
contents:
|
||||
@ -118,8 +131,7 @@ storage:
|
||||
--volume config,kind=host,source=/etc/kubernetes \
|
||||
--mount volume=config,target=/etc/kubernetes \
|
||||
--insecure-options=image \
|
||||
docker://k8s.gcr.io/hyperkube:v1.17.0 \
|
||||
docker://quay.io/poseidon/kubelet:v1.18.0 \
|
||||
--net=host \
|
||||
--dns=host \
|
||||
-- \
|
||||
kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
|
||||
--exec=/usr/local/bin/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
|
@ -1,3 +1,8 @@
|
||||
locals {
|
||||
official_images = ["coreos-stable", "coreos-beta", "coreos-alpha"]
|
||||
is_official_image = contains(local.official_images, var.os_image)
|
||||
}
|
||||
|
||||
# Controller Instance DNS records
|
||||
resource "digitalocean_record" "controllers" {
|
||||
count = var.controller_count
|
||||
@ -11,7 +16,7 @@ resource "digitalocean_record" "controllers" {
|
||||
ttl = 300
|
||||
|
||||
# IPv4 addresses of controllers
|
||||
value = element(digitalocean_droplet.controllers.*.ipv4_address, count.index)
|
||||
value = digitalocean_droplet.controllers.*.ipv4_address[count.index]
|
||||
}
|
||||
|
||||
# Discrete DNS records for each controller's private IPv4 for etcd usage
|
||||
@ -27,7 +32,7 @@ resource "digitalocean_record" "etcds" {
|
||||
ttl = 300
|
||||
|
||||
# private IPv4 address for etcd
|
||||
value = element(digitalocean_droplet.controllers.*.ipv4_address_private, count.index)
|
||||
value = digitalocean_droplet.controllers.*.ipv4_address_private[count.index]
|
||||
}
|
||||
|
||||
# Controller droplet instances
|
||||
@ -37,14 +42,15 @@ resource "digitalocean_droplet" "controllers" {
|
||||
name = "${var.cluster_name}-controller-${count.index}"
|
||||
region = var.region
|
||||
|
||||
image = var.image
|
||||
image = var.os_image
|
||||
size = var.controller_type
|
||||
|
||||
# network
|
||||
ipv6 = true
|
||||
# only official DigitalOcean images support IPv6
|
||||
ipv6 = local.is_official_image
|
||||
private_networking = true
|
||||
|
||||
user_data = element(data.ct_config.controller-ignitions.*.rendered, count.index)
|
||||
user_data = data.ct_config.controller-ignitions.*.rendered[count.index]
|
||||
ssh_keys = var.ssh_fingerprints
|
||||
|
||||
tags = [
|
||||
@ -64,7 +70,7 @@ resource "digitalocean_tag" "controllers" {
|
||||
# Controller Ignition configs
|
||||
data "ct_config" "controller-ignitions" {
|
||||
count = var.controller_count
|
||||
content = element(data.template_file.controller-configs.*.rendered, count.index)
|
||||
content = data.template_file.controller-configs.*.rendered[count.index]
|
||||
pretty_print = false
|
||||
snippets = var.controller_clc_snippets
|
||||
}
|
||||
@ -73,7 +79,7 @@ data "ct_config" "controller-ignitions" {
|
||||
data "template_file" "controller-configs" {
|
||||
count = var.controller_count
|
||||
|
||||
template = file("${path.module}/cl/controller.yaml.tmpl")
|
||||
template = file("${path.module}/cl/controller.yaml")
|
||||
|
||||
vars = {
|
||||
# Cannot use cyclic dependencies on controllers or their DNS records
|
||||
|
@ -16,12 +16,20 @@ resource "digitalocean_firewall" "rules" {
|
||||
source_tags = [digitalocean_tag.controllers.name, digitalocean_tag.workers.name]
|
||||
}
|
||||
|
||||
# Allow Prometheus to scrape node-exporter
|
||||
inbound_rule {
|
||||
protocol = "tcp"
|
||||
port_range = "9100"
|
||||
source_tags = [digitalocean_tag.workers.name]
|
||||
}
|
||||
|
||||
# Allow Prometheus to scrape kube-proxy
|
||||
inbound_rule {
|
||||
protocol = "tcp"
|
||||
port_range = "10249"
|
||||
source_tags = [digitalocean_tag.workers.name]
|
||||
}
|
||||
|
||||
inbound_rule {
|
||||
protocol = "tcp"
|
||||
port_range = "10250"
|
||||
|
@ -27,6 +27,12 @@ output "workers_ipv6" {
|
||||
value = digitalocean_droplet.workers.*.ipv6_address
|
||||
}
|
||||
|
||||
# Outputs for worker pools
|
||||
|
||||
output "kubeconfig" {
|
||||
value = module.bootstrap.kubeconfig-kubelet
|
||||
}
|
||||
|
||||
# Outputs for custom firewalls
|
||||
|
||||
output "controller_tag" {
|
||||
|
@ -2,7 +2,7 @@ locals {
|
||||
# format assets for distribution
|
||||
assets_bundle = [
|
||||
# header with the unpack location
|
||||
for key, value in module.bootstrap.assets_dist:
|
||||
for key, value in module.bootstrap.assets_dist :
|
||||
format("##### %s\n%s", key, value)
|
||||
]
|
||||
}
|
||||
|
@ -41,7 +41,7 @@ variable "worker_type" {
|
||||
default = "s-1vcpu-2gb"
|
||||
}
|
||||
|
||||
variable "image" {
|
||||
variable "os_image" {
|
||||
type = string
|
||||
description = "Container Linux image for instances (e.g. coreos-stable)"
|
||||
default = "coreos-stable"
|
||||
|
@ -8,11 +8,12 @@ resource "digitalocean_record" "workers-record-a" {
|
||||
name = "${var.cluster_name}-workers"
|
||||
type = "A"
|
||||
ttl = 300
|
||||
value = element(digitalocean_droplet.workers.*.ipv4_address, count.index)
|
||||
value = digitalocean_droplet.workers.*.ipv4_address[count.index]
|
||||
}
|
||||
|
||||
resource "digitalocean_record" "workers-record-aaaa" {
|
||||
count = var.worker_count
|
||||
# only official DigitalOcean images support IPv6
|
||||
count = local.is_official_image ? var.worker_count : 0
|
||||
|
||||
# DNS zone where record should be created
|
||||
domain = var.dns_zone
|
||||
@ -20,7 +21,7 @@ resource "digitalocean_record" "workers-record-aaaa" {
|
||||
name = "${var.cluster_name}-workers"
|
||||
type = "AAAA"
|
||||
ttl = 300
|
||||
value = element(digitalocean_droplet.workers.*.ipv6_address, count.index)
|
||||
value = digitalocean_droplet.workers.*.ipv6_address[count.index]
|
||||
}
|
||||
|
||||
# Worker droplet instances
|
||||
@ -30,11 +31,12 @@ resource "digitalocean_droplet" "workers" {
|
||||
name = "${var.cluster_name}-worker-${count.index}"
|
||||
region = var.region
|
||||
|
||||
image = var.image
|
||||
image = var.os_image
|
||||
size = var.worker_type
|
||||
|
||||
# network
|
||||
ipv6 = true
|
||||
# only official DigitalOcean images support IPv6
|
||||
ipv6 = local.is_official_image
|
||||
private_networking = true
|
||||
|
||||
user_data = data.ct_config.worker-ignition.rendered
|
||||
@ -63,7 +65,7 @@ data "ct_config" "worker-ignition" {
|
||||
|
||||
# Worker Container Linux config
|
||||
data "template_file" "worker-config" {
|
||||
template = file("${path.module}/cl/worker.yaml.tmpl")
|
||||
template = file("${path.module}/cl/worker.yaml")
|
||||
|
||||
vars = {
|
||||
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
|
||||
|
@ -1,29 +0,0 @@
|
||||
# Container Linux Update Operator
|
||||
|
||||
The [Container Linux Update Operator](https://github.com/coreos/container-linux-update-operator) (i.e. CLUO) coordinates reboots of auto-updating Container Linux nodes so that one node reboots at a time and nodes are drained before reboot. CLUO enables the auto-update behavior Container Linux clusters are known for, but does so in a Kubernetes native way.
|
||||
|
||||
## Create
|
||||
|
||||
Create the `update-operator` deployment and `update-agent` DaemonSet.
|
||||
|
||||
```sh
|
||||
kubectl apply -f addons/cluo -R
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
`update-agent` runs as a DaemonSet and annotates a node when `update-engine.service` indicates an update has been installed and a reboot is needed. It also adds additional labels and annotations to nodes.
|
||||
|
||||
```
|
||||
$ kubectl get nodes --show-labels
|
||||
...
|
||||
container-linux-update.v1.coreos.com/group=stable
|
||||
container-linux-update.v1.coreos.com/version=1632.3.0
|
||||
```
|
||||
|
||||
`update-operator` ensures one node reboots at a time and that pods are drained prior to reboot.
|
||||
|
||||
!!! note ""
|
||||
CLUO replaces `locksmithd` reboot coordination. The `update_engine` systemd unit on hosts still performs the Container Linux update check, download, and install to the inactive partition.
|
||||
|
||||
|
@ -2,7 +2,6 @@
|
||||
|
||||
Every Typhoon cluster is verified to work well with several post-install addons.
|
||||
|
||||
* [CLUO](cluo.md) (Container Linux only)
|
||||
* Nginx [Ingress Controller](ingress.md)
|
||||
* [Prometheus](prometheus.md)
|
||||
* [Grafana](grafana.md)
|
||||
|
@ -8,24 +8,88 @@ Typhoon modules accept Terraform input variables for customizing clusters in mer
|
||||
|
||||
## Addons
|
||||
|
||||
Clusters are kept to a minimal Kubernetes control plane by offering components like Nginx Ingress Controller, Prometheus, Grafana, and Heapster as optional post-install [addons](https://github.com/poseidon/typhoon/tree/master/addons). Customize addons by modifying a copy of our addon manifests.
|
||||
Clusters are kept to a minimal Kubernetes control plane by offering components like Nginx Ingress Controller, Prometheus, and Grafana as optional post-install [addons](https://github.com/poseidon/typhoon/tree/master/addons). Customize addons by modifying a copy of our addon manifests.
|
||||
|
||||
## Hosts
|
||||
|
||||
### Container Linux
|
||||
Typhoon uses the [Ignition](https://github.com/coreos/ignition) system of Fedora CoreOS and Flatcar Linux to immutably declare a system via first-boot disk provisioning. Fedora CoreOS uses a [Fedora CoreOS Config](https://docs.fedoraproject.org/en-US/fedora-coreos/fcct-config/) (FCC) and Flatcar Linux uses a [Container Linux Config](https://github.com/coreos/container-linux-config-transpiler/blob/master/doc/examples.md) (CLC). These define disk partitions, filesystems, systemd units, dropins, config files, mount units, raid arrays, and users.
|
||||
|
||||
Controller and worker instances form a minimal and secure Kubernetes cluster on each platform. Typhoon provides the **snippets** feature to accept Fedora CoreOS Configs or Container Linux Configs to validate and additively merge into instance declarations. This allows advanced host customization and experimentation.
|
||||
|
||||
!!! note
|
||||
Snippets cannot be used to modify an already existing instance, the antithesis of immutable provisioning. Ignition fully declares a system on first boot only.
|
||||
|
||||
!!! danger
|
||||
Container Linux Configs provide powerful host customization abilities. You are responsible for the additional configs defined for hosts.
|
||||
Snippets provide the powerful host customization abilities of Ignition. You are responsible for additional units, configs, files, and conflicts.
|
||||
|
||||
Container Linux Configs (CLCs) declare how a Container Linux instance's disk should be provisioned on first boot from disk. CLCs define disk partitions, filesystems, files, systemd units, dropins, networkd configs, mount units, raid arrays, and users. Typhoon creates controller and worker instances with base Container Linux Configs to create a minimal, secure Kubernetes cluster on each platform.
|
||||
!!! danger
|
||||
Edits to snippets for controller instances can (correctly) cause Terraform to observe a diff (if not otherwise suppressed) and propose destroying and recreating controller(s). Recognize that this is destructive since controllers run etcd and are stateful. See [blue/green](/topics/maintenance/#upgrades) clusters.
|
||||
|
||||
Typhoon AWS, Azure, bare-metal, DigitalOcean, and Google Cloud support CLC *snippets* - valid Container Linux Configs that are validated and additively merged into the Typhoon base config during `terraform plan`. This allows advanced host customizations and experimentation.
|
||||
### Fedora CoreOS
|
||||
|
||||
#### Examples
|
||||
!!! note
|
||||
Fedora CoreOS snippets require `terraform-provider-ct` v0.5+
|
||||
|
||||
Container Linux [docs](https://coreos.com/os/docs/latest/clc-examples.html) show many simple config examples. Ensure a file `/opt/hello` is created with permissions 0644.
|
||||
Define a Fedora CoreOS Config (FCC) ([docs](https://docs.fedoraproject.org/en-US/fedora-coreos/fcct-config/), [config](https://github.com/coreos/fcct/blob/master/docs/configuration-v1_0.md), [examples](https://github.com/coreos/fcct/blob/master/docs/examples.md)) in version control near your Terraform workspace directory (e.g. perhaps in a `snippets` subdirectory). You may organize snippets into multiple files, if desired.
|
||||
|
||||
For example, ensure an `/opt/hello` file is created with permissions 0644.
|
||||
|
||||
```yaml
|
||||
# custom-files
|
||||
variant: fcos
|
||||
version: 1.0.0
|
||||
storage:
|
||||
files:
|
||||
- path: /opt/hello
|
||||
contents:
|
||||
inline: |
|
||||
Hello World
|
||||
mode: 0644
|
||||
```
|
||||
|
||||
Reference the FCC contents by location (e.g. `file("./custom-units.yaml")`). On [AWS](/fedora-coreos/aws/#cluster) or [Google Cloud](/fedora-coreos/google-cloud/#cluster) extend the `controller_snippets` or `worker_snippets` list variables.
|
||||
|
||||
```tf
|
||||
module "nemo" {
|
||||
...
|
||||
|
||||
controller_count = 1
|
||||
worker_count = 2
|
||||
controller_snippets = [
|
||||
file("./custom-files"),
|
||||
file("./custom-units"),
|
||||
]
|
||||
worker_snippets = [
|
||||
file("./custom-files"),
|
||||
file("./custom-units")",
|
||||
]
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
On [Bare-Metal](/fedora-coreos/bare-metal/#cluster), different FCCs may be used for each node (since hardware may be heterogeneous). Extend the `snippets` map variable by mapping a controller or worker name key to a list of snippets.
|
||||
|
||||
```tf
|
||||
module "mercury" {
|
||||
...
|
||||
snippets = {
|
||||
"node2" = [file("./units/hello.yaml")]
|
||||
"node3" = [
|
||||
file("./units/world.yaml"),
|
||||
file("./units/hello.yaml"),
|
||||
]
|
||||
}
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
### Container Linux
|
||||
|
||||
Define a Container Linux Config (CLC) ([config](https://github.com/coreos/container-linux-config-transpiler/blob/master/doc/configuration.md), [examples](https://github.com/coreos/container-linux-config-transpiler/blob/master/doc/examples.md)) in version control near your Terraform workspace directory (e.g. perhaps in a `snippets` subdirectory). You may organize snippets into multiple files, if desired.
|
||||
|
||||
For example, ensure an `/opt/hello` file is created with permissions 0644.
|
||||
|
||||
```yaml
|
||||
# custom-files
|
||||
storage:
|
||||
files:
|
||||
@ -37,9 +101,9 @@ storage:
|
||||
mode: 0644
|
||||
```
|
||||
|
||||
Ensure a systemd unit `hello.service` is created and a dropin `50-etcd-cluster.conf` is added for `etcd-member.service`.
|
||||
Or ensure a systemd unit `hello.service` is created and a dropin `50-etcd-cluster.conf` is added for `etcd-member.service`.
|
||||
|
||||
```
|
||||
```yaml
|
||||
# custom-units
|
||||
systemd:
|
||||
units:
|
||||
@ -61,17 +125,9 @@ systemd:
|
||||
Environment="ETCD_LOG_PACKAGE_LEVELS=etcdserver=WARNING,security=DEBUG"
|
||||
```
|
||||
|
||||
#### Specification
|
||||
Reference the CLC contents by location (e.g. `file("./custom-units.yaml")`). On [AWS](/cl/aws/#cluster), [Azure](/cl/azure/#cluster), [DigitalOcean](/cl/digital-ocean/#cluster), or [Google Cloud](/cl/google-cloud/#cluster) extend the `controller_clc_snippets` or `worker_clc_snippets` list variables.
|
||||
|
||||
View the Container Linux Config [format](https://coreos.com/os/docs/1576.4.0/configuration.html) to read about each field.
|
||||
|
||||
#### Usage
|
||||
|
||||
Write Container Linux Configs *snippets* as files in the repository where you keep Terraform configs for clusters (perhaps in a `clc` or `snippets` subdirectory). You may organize snippets in multiple files as desired, provided they are each valid.
|
||||
|
||||
[AWS](/cl/aws/#cluster), [Azure](/cl/azure/#cluster), [DigitalOcean](/cl/digital-ocean/#cluster), and [Google Cloud](/cl/google-cloud/#cluster) clusters allow populating a list of `controller_clc_snippets` or `worker_clc_snippets`.
|
||||
|
||||
```
|
||||
```tf
|
||||
module "nemo" {
|
||||
...
|
||||
|
||||
@ -89,16 +145,11 @@ module "nemo" {
|
||||
}
|
||||
```
|
||||
|
||||
[Bare-Metal](/cl/bare-metal/#cluster) clusters allow different Container Linux snippets to be used for each node (since hardware may be heterogeneous). Populate the optional `clc_snippets` map variable with any controller or worker name keys and lists of snippets.
|
||||
On [Bare-Metal](/cl/bare-metal/#cluster), different CLCs may be used for each node (since hardware may be heterogeneous). Extend the `clc_snippets` map variable by mapping a controller or worker name key to a list of snippets.
|
||||
|
||||
```
|
||||
```tf
|
||||
module "mercury" {
|
||||
...
|
||||
controller_names = ["node1"]
|
||||
worker_names = [
|
||||
"node2",
|
||||
"node3",
|
||||
]
|
||||
clc_snippets = {
|
||||
"node2" = [file("./units/hello.yaml")]
|
||||
"node3" = [
|
||||
@ -110,32 +161,6 @@ module "mercury" {
|
||||
}
|
||||
```
|
||||
|
||||
Plan the resources to be created.
|
||||
|
||||
```
|
||||
$ terraform plan
|
||||
Plan: 54 to add, 0 to change, 0 to destroy.
|
||||
```
|
||||
|
||||
Most syntax errors in CLCs can be caught during planning. For example, mangle the indentation in one of the CLC files:
|
||||
|
||||
```
|
||||
$ terraform plan
|
||||
...
|
||||
error parsing Container Linux Config: error: yaml: line 3: did not find expected '-' indicator
|
||||
```
|
||||
|
||||
Undo the mangle. Apply the changes to create the cluster per the tutorial.
|
||||
|
||||
```
|
||||
$ terraform apply
|
||||
```
|
||||
|
||||
Container Linux Configs (and the CoreOS Ignition system) create immutable infrastructure. Disk provisioning is performed only on first boot from disk. That means if you change a snippet used by an instance, Terraform will (correctly) try to destroy and recreate that instance. Be careful!
|
||||
|
||||
!!! danger
|
||||
Destroying and recreating controller instances is destructive! etcd runs on controller instances and stores data there. Do not modify controller snippets. See [blue/green](/topics/maintenance/#upgrades) clusters.
|
||||
|
||||
## Architecture
|
||||
|
||||
Typhoon chooses variables to expose with purpose. If you must customize clusters in ways that aren't supported by input variables, fork Typhoon and maintain a repository with customizations. Reference the repository by changing the username.
|
||||
|
@ -79,7 +79,7 @@ Create a cluster following the Azure [tutorial](../cl/azure.md#cluster). Define
|
||||
|
||||
```tf
|
||||
module "ramius-worker-pool" {
|
||||
source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes/workers?ref=v1.17.0"
|
||||
source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes/workers?ref=v1.18.0"
|
||||
|
||||
# Azure
|
||||
region = module.ramius.region
|
||||
@ -145,7 +145,7 @@ Create a cluster following the Google Cloud [tutorial](../cl/google-cloud.md#clu
|
||||
|
||||
```tf
|
||||
module "yavin-worker-pool" {
|
||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.17.0"
|
||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.18.0"
|
||||
|
||||
# Google Cloud
|
||||
region = "europe-west2"
|
||||
@ -176,11 +176,11 @@ Verify a managed instance group of workers joins the cluster within a few minute
|
||||
```
|
||||
$ kubectl get nodes
|
||||
NAME STATUS AGE VERSION
|
||||
yavin-controller-0.c.example-com.internal Ready 6m v1.17.0
|
||||
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.17.0
|
||||
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.17.0
|
||||
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.17.0
|
||||
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.17.0
|
||||
yavin-controller-0.c.example-com.internal Ready 6m v1.18.0
|
||||
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.18.0
|
||||
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.18.0
|
||||
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.18.0
|
||||
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.18.0
|
||||
```
|
||||
|
||||
### Variables
|
||||
|
@ -1,5 +1,15 @@
|
||||
# Announce <img align="right" src="https://storage.googleapis.com/poseidon/typhoon-logo-small.png">
|
||||
|
||||
## Jan 23, 2020
|
||||
|
||||
Typhoon for Fedora CoreOS promoted to alpha!
|
||||
|
||||
Last summer, Typhoon released the first preview of Kubernetes on Fedora CoreOS for bare-metal and AWS, developing many ideas and patterns from Typhoon for Container Linux and Fedora Atomic. Since then, Typhoon for Fedora CoreOS has evolved and gained features alongside Typhoon, while Fedora CoreOS itself has evolved and improved too.
|
||||
|
||||
Fedora recently [announced](https://fedoramagazine.org/fedora-coreos-out-of-preview/) that Fedora CoreOS is available for general use. To align with that change and to better indicate the maturing status, Typhoon for Fedora CoreOS has been promoted to alpha. Many thanks to folks who have worked to make this possbile!
|
||||
|
||||
About: For newcomers, Typhoon is a minimal and free (cost and freedom) Kubernetes distribution providing upstream Kubernetes, declarative configuration via Terraform, and support for AWS, Azure, Google Cloud, DigitalOcean, and bare-metal. It is run by former CoreOS engineer [@dghubble](https://twitter.com/dghubble) to power his clusters, with freedom [motivations](https://typhoon.psdn.io/#motivation).
|
||||
|
||||
## Jul 18, 2019
|
||||
|
||||
Introducing a preview of Typhoon Kubernetes clusters with Fedora CoreOS!
|
||||
@ -8,8 +18,6 @@ Fedora recently [announced](https://lists.fedoraproject.org/archives/list/coreos
|
||||
|
||||
While Typhoon uses Container Linux (or Flatcar Linux) for stable modules, the project hasn't been a stranger to Fedora ideas, once developing a [Fedora Atomic](https://typhoon.psdn.io/announce/#april-26-2018) variant in 2018. That makes the Fedora CoreOS fushion both exciting and familiar. Typhoon with Fedora CoreOS uses Ignition v3 for provisioning, uses rpm-ostree for layering and updates, tries swapping system containers for podman, and brings SELinux enforcement ([table](https://typhoon.psdn.io/architecture/operating-systems/)). This is an early preview (don't go to prod), but do try it out and help identify and solve issues (getting started links above).
|
||||
|
||||
About: For newcomers, Typhoon is a minimal and free (cost and freedom) Kubernetes distribution providing upstream Kubernetes, declarative configuration via Terraform, and support for AWS, Azure, Google Cloud, DigitalOcean, and bare-metal. It is run by former CoreOS engineer [@dghubble](https://twitter.com/dghubble) to power his clusters with freedom [motivations](https://typhoon.psdn.io/#motivation).
|
||||
|
||||
## March 27, 2019
|
||||
|
||||
Last April, Typhoon [introduced](#april-26-2018) alpha support for creating Kubernetes clusters with Fedora Atomic on AWS, Google Cloud, DigitalOcean, and bare-metal. Fedora Atomic shared many of Container Linux's aims for a container-optimized operating system, introduced novel ideas, and provided technical diversification for an uncertain future. However, Project Atomic efforts were merged into Fedora CoreOS and future Fedora Atomic releases are [not expected](http://www.projectatomic.io/blog/2018/06/welcome-to-fedora-coreos/). *Typhoon modules for Fedora Atomic will not be updated much beyond Kubernetes v1.13*. They may later be removed.
|
||||
@ -46,7 +54,7 @@ Typhoon for Fedora Atomic reflects many of the same principles that created Typh
|
||||
|
||||
Meanwhile, Fedora Atomic adds some promising new low-level technologies:
|
||||
|
||||
* [ostree](https://github.com/ostreedev/ostree) & [rpm-ostree](https://github.com/projectatomic/rpm-ostree) - a hybrid, layered, image and package system that lets you perform atomic updates and rollbacks, layer on packages, "rebase" your system, or manage a remote tree repo. See Dusty Mabe's great [intro](https://dustymabe.com/2017/09/01/atomic-host-101-lab-part-3-rebase-upgrade-rollback/).
|
||||
* [ostree](https://github.com/ostreedev/ostree) & [rpm-ostree](https://github.com/projectatomic/rpm-ostree) - a hybrid, layered, image and package system that lets you perform atomic updates and rollbacks, layer on packages, "rebase" your system, or manage a remote tree repo. See Dusty Mabe's great [intro](https://dustymabe.com/2017/09/01/atomic-host-101-lab-part-3-rebase-upgrade-rollback/).
|
||||
|
||||
* [system containers](http://www.projectatomic.io/blog/2016/09/intro-to-system-containers/) - OCI container images that embed systemd and runc metadata for starting low-level host services before container runtimes are ready. Typhoon uses system containers under runc for `etcd`, `kubelet`, and `bootkube` on Fedora Atomic (instead of rkt-fly).
|
||||
|
||||
|
@ -79,6 +79,23 @@ resource "aws_security_group_rule" "some-app" {
|
||||
}
|
||||
```
|
||||
|
||||
## Routes
|
||||
|
||||
Add a custom [route](https://www.terraform.io/docs/providers/aws/r/route.html) to the VPC route table.
|
||||
|
||||
```tf
|
||||
data "aws_route_table" "default" {
|
||||
vpc_id = module.temptest.vpc_id
|
||||
subnet_id = module.tempest.subnet_ids[0]
|
||||
}
|
||||
|
||||
resource "aws_route" "peering" {
|
||||
route_table_id = data.aws_route_table.default.id
|
||||
destination_cidr_block = "192.168.4.0/24"
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
## IPv6
|
||||
|
||||
AWS Network Load Balancers do not support `dualstack`.
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Operating Systems
|
||||
|
||||
Typhoon supports [Container Linux](https://coreos.com/why/), [Flatcar Linux](https://www.flatcar-linux.org/) and [Fedora CoreOS](https://getfedora.org/coreos/) (preview). These operating systems were chosen because they offer:
|
||||
Typhoon supports [Fedora CoreOS](https://getfedora.org/coreos/), [Flatcar Linux](https://www.flatcar-linux.org/) and Container Linux (EOL in May 2020). These operating systems were chosen because they offer:
|
||||
|
||||
* Minimalism and focus on clustered operation
|
||||
* Automated and atomic operating system upgrades
|
||||
@ -9,16 +9,18 @@ Typhoon supports [Container Linux](https://coreos.com/why/), [Flatcar Linux](htt
|
||||
|
||||
Together, they diversify Typhoon to support a range of container technologies.
|
||||
|
||||
* Container Linux: Gentoo core, rkt-fly, docker
|
||||
* Fedora CoreOS: rpm-ostree, podman, moby
|
||||
* Flatcar Linux: Gentoo core, rkt-fly, docker
|
||||
|
||||
## Host Properties
|
||||
|
||||
| Property | Container Linux / Flatcar Linux | Fedora CoreOS |
|
||||
| Property | Flatcar Linux | Fedora CoreOS |
|
||||
|-------------------|---------------------------------|---------------|
|
||||
| Kernel | ~4.19.x | ~5.5.x |
|
||||
| systemd | 241 | 243 |
|
||||
| Ignition system | Ignition v2.x spec | Ignition v3.x spec |
|
||||
| Container Engine | docker | docker |
|
||||
| storage driver | overlay2 | overlay2 |
|
||||
| Container Engine | docker 18.06.3-ce | docker 18.09.8 |
|
||||
| storage driver | overlay2 (extfs) | overlay2 (xfs) |
|
||||
| logging driver | json-file | journald |
|
||||
| cgroup driver | cgroupfs (except Flatcar edge) | systemd |
|
||||
| Networking | systemd-networkd | NetworkManager |
|
||||
@ -26,13 +28,13 @@ Together, they diversify Typhoon to support a range of container technologies.
|
||||
|
||||
## Kubernetes Properties
|
||||
|
||||
| Property | Container Linux | Fedora CoreOS |
|
||||
| Property | Flatcar Linux | Fedora CoreOS |
|
||||
|-------------------|-----------------|---------------|
|
||||
| single-master | all platforms | all platforms |
|
||||
| multi-master | all platforms | all platforms |
|
||||
| control plane | static pods | static pods |
|
||||
| kubelet image | upstream hyperkube | upstream hyperkube |
|
||||
| control plane images | upstream hyperkube | upstream hyperkube |
|
||||
| kubelet image | kubelet [image](https://github.com/poseidon/kubelet) with upstream binary | kubelet [image](https://github.com/poseidon/kubelet) with upstream binary |
|
||||
| control plane images | upstream images | upstream images |
|
||||
| on-host etcd | rkt-fly | podman |
|
||||
| on-host kubelet | rkt-fly | podman |
|
||||
| CNI plugins | calico or flannel | calico or flannel |
|
||||
|
@ -1,6 +1,6 @@
|
||||
# AWS
|
||||
|
||||
In this tutorial, we'll create a Kubernetes v1.17.0 cluster on AWS with Container Linux.
|
||||
In this tutorial, we'll create a Kubernetes v1.18.0 cluster on AWS with CoreOS Container Linux or Flatcar Linux.
|
||||
|
||||
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets.
|
||||
|
||||
@ -18,15 +18,15 @@ Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your sy
|
||||
|
||||
```sh
|
||||
$ terraform version
|
||||
Terraform v0.12.16
|
||||
Terraform v0.12.21
|
||||
```
|
||||
|
||||
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.4.0/terraform-provider-ct-v0.4.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.4.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.4.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.4.0
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0
|
||||
```
|
||||
|
||||
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
|
||||
@ -49,13 +49,13 @@ Configure the AWS provider to use your access key credentials in a `providers.tf
|
||||
|
||||
```tf
|
||||
provider "aws" {
|
||||
version = "2.41.0"
|
||||
version = "2.53.0"
|
||||
region = "eu-central-1"
|
||||
shared_credentials_file = "/home/user/.config/aws/credentials"
|
||||
}
|
||||
|
||||
provider "ct" {
|
||||
version = "0.4.0"
|
||||
version = "0.5.0"
|
||||
}
|
||||
```
|
||||
|
||||
@ -70,7 +70,7 @@ Define a Kubernetes cluster using the module `aws/container-linux/kubernetes`.
|
||||
|
||||
```tf
|
||||
module "tempest" {
|
||||
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.17.0"
|
||||
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.18.0"
|
||||
|
||||
# AWS
|
||||
cluster_name = "tempest"
|
||||
@ -143,36 +143,33 @@ List nodes in the cluster.
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/tempest-config
|
||||
$ kubectl get nodes
|
||||
NAME STATUS ROLES AGE VERSION
|
||||
ip-10-0-3-155 Ready <none> 10m v1.17.0
|
||||
ip-10-0-26-65 Ready <none> 10m v1.17.0
|
||||
ip-10-0-41-21 Ready <none> 10m v1.17.0
|
||||
ip-10-0-3-155 Ready <none> 10m v1.18.0
|
||||
ip-10-0-26-65 Ready <none> 10m v1.18.0
|
||||
ip-10-0-41-21 Ready <none> 10m v1.18.0
|
||||
```
|
||||
|
||||
List the pods.
|
||||
|
||||
```
|
||||
$ kubectl get pods --all-namespaces
|
||||
NAMESPACE NAME READY STATUS RESTARTS AGE
|
||||
kube-system calico-node-1m5bf 2/2 Running 0 34m
|
||||
kube-system calico-node-7jmr1 2/2 Running 0 34m
|
||||
kube-system calico-node-bknc8 2/2 Running 0 34m
|
||||
kube-system coredns-1187388186-wx1lg 1/1 Running 0 34m
|
||||
NAMESPACE NAME READY STATUS RESTARTS AGE
|
||||
kube-system calico-node-1m5bf 2/2 Running 0 34m
|
||||
kube-system calico-node-7jmr1 2/2 Running 0 34m
|
||||
kube-system calico-node-bknc8 2/2 Running 0 34m
|
||||
kube-system coredns-1187388186-wx1lg 1/1 Running 0 34m
|
||||
kube-system coredns-1187388186-qjnvp 1/1 Running 0 34m
|
||||
kube-system kube-apiserver-ip-10-0-3-155 1/1 Running 0 34m
|
||||
kube-system kube-controller-manager-ip-10-0-3-155 1/1 Running 0 34m
|
||||
kube-system kube-proxy-14wxv 1/1 Running 0 34m
|
||||
kube-system kube-proxy-9vxh2 1/1 Running 0 34m
|
||||
kube-system kube-proxy-sbbsh 1/1 Running 0 34m
|
||||
kube-system kube-scheduler-ip-10-0-3-155 1/1 Running 1 34m
|
||||
kube-system kube-apiserver-ip-10-0-3-155 1/1 Running 0 34m
|
||||
kube-system kube-controller-manager-ip-10-0-3-155 1/1 Running 0 34m
|
||||
kube-system kube-proxy-14wxv 1/1 Running 0 34m
|
||||
kube-system kube-proxy-9vxh2 1/1 Running 0 34m
|
||||
kube-system kube-proxy-sbbsh 1/1 Running 0 34m
|
||||
kube-system kube-scheduler-ip-10-0-3-155 1/1 Running 1 34m
|
||||
```
|
||||
|
||||
## Going Further
|
||||
|
||||
Learn about [maintenance](/topics/maintenance/) and [addons](/addons/overview/).
|
||||
|
||||
!!! note
|
||||
On Container Linux clusters, install the `CLUO` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
|
||||
|
||||
## Variables
|
||||
|
||||
Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/aws/container-linux/kubernetes/variables.tf) source.
|
||||
@ -207,7 +204,6 @@ Reference the DNS zone id with `aws_route53_zone.zone-for-clusters.zone_id`.
|
||||
|
||||
| Name | Description | Default | Example |
|
||||
|:-----|:------------|:--------|:--------|
|
||||
| asset_dir | Absolute path to a directory where generated assets should be placed (contains secrets) | "" (disabled) | "/home/user/.secrets/clusters/tempest" |
|
||||
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
|
||||
| worker_count | Number of workers | 1 | 3 |
|
||||
| controller_type | EC2 instance type for controllers | "t3.small" | See below |
|
||||
|
@ -3,7 +3,7 @@
|
||||
!!! danger
|
||||
Typhoon for Azure is alpha. For production, use AWS, Google Cloud, or bare-metal. As Azure matures, check [errata](https://github.com/poseidon/typhoon/wiki/Errata) for known shortcomings.
|
||||
|
||||
In this tutorial, we'll create a Kubernetes v1.17.0 cluster on Azure with Container Linux.
|
||||
In this tutorial, we'll create a Kubernetes v1.18.0 cluster on Azure with CoreOS Container Linux or Flatcar Linux.
|
||||
|
||||
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a resource group, virtual network, subnets, security groups, controller availability set, worker scale set, load balancer, and TLS assets.
|
||||
|
||||
@ -21,15 +21,15 @@ Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your sy
|
||||
|
||||
```sh
|
||||
$ terraform version
|
||||
Terraform v0.12.16
|
||||
Terraform v0.12.21
|
||||
```
|
||||
|
||||
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.4.0/terraform-provider-ct-v0.4.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.4.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.4.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.4.0
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0
|
||||
```
|
||||
|
||||
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
|
||||
@ -50,11 +50,11 @@ Configure the Azure provider in a `providers.tf` file.
|
||||
|
||||
```tf
|
||||
provider "azurerm" {
|
||||
version = "1.38.0"
|
||||
version = "2.1.0"
|
||||
}
|
||||
|
||||
provider "ct" {
|
||||
version = "0.4.0"
|
||||
version = "0.5.0"
|
||||
}
|
||||
```
|
||||
|
||||
@ -66,7 +66,7 @@ Define a Kubernetes cluster using the module `azure/container-linux/kubernetes`.
|
||||
|
||||
```tf
|
||||
module "ramius" {
|
||||
source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes?ref=v1.17.0"
|
||||
source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes?ref=v1.18.0"
|
||||
|
||||
# Azure
|
||||
cluster_name = "ramius"
|
||||
@ -85,6 +85,15 @@ module "ramius" {
|
||||
|
||||
Reference the [variables docs](#variables) or the [variables.tf](https://github.com/poseidon/typhoon/blob/master/azure/container-linux/kubernetes/variables.tf) source.
|
||||
|
||||
### Flatcar Linux Only
|
||||
|
||||
Flatcar Linux publishes images to the Azure Marketplace and requires accepting their legal terms.
|
||||
|
||||
```
|
||||
az vm image terms show --publish kinvolk --offer flatcar-container-linux --plan stable
|
||||
az vm image terms accept --publish kinvolk --offer flatcar-container-linux --plan stable
|
||||
```
|
||||
|
||||
## ssh-agent
|
||||
|
||||
Initial bootstrapping requires `bootstrap.service` be started on one controller node. Terraform uses `ssh-agent` to automate this step. Add your SSH private key to `ssh-agent`.
|
||||
@ -140,9 +149,9 @@ List nodes in the cluster.
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/ramius-config
|
||||
$ kubectl get nodes
|
||||
NAME STATUS ROLES AGE VERSION
|
||||
ramius-controller-0 Ready <none> 24m v1.17.0
|
||||
ramius-worker-000001 Ready <none> 25m v1.17.0
|
||||
ramius-worker-000002 Ready <none> 24m v1.17.0
|
||||
ramius-controller-0 Ready <none> 24m v1.18.0
|
||||
ramius-worker-000001 Ready <none> 25m v1.18.0
|
||||
ramius-worker-000002 Ready <none> 24m v1.18.0
|
||||
```
|
||||
|
||||
List the pods.
|
||||
@ -152,9 +161,9 @@ $ kubectl get pods --all-namespaces
|
||||
NAMESPACE NAME READY STATUS RESTARTS AGE
|
||||
kube-system coredns-7c6fbb4f4b-b6qzx 1/1 Running 0 26m
|
||||
kube-system coredns-7c6fbb4f4b-j2k3d 1/1 Running 0 26m
|
||||
kube-system calico-node-1m5bf 2/2 Running 0 26m
|
||||
kube-system calico-node-7jmr1 2/2 Running 0 26m
|
||||
kube-system calico-node-bknc8 2/2 Running 0 26m
|
||||
kube-system calico-node-1m5bf 2/2 Running 0 26m
|
||||
kube-system calico-node-7jmr1 2/2 Running 0 26m
|
||||
kube-system calico-node-bknc8 2/2 Running 0 26m
|
||||
kube-system kube-apiserver-ramius-controller-0 1/1 Running 0 26m
|
||||
kube-system kube-controller-manager-ramius-controller-0 1/1 Running 0 26m
|
||||
kube-system kube-proxy-j4vpq 1/1 Running 0 26m
|
||||
@ -167,9 +176,6 @@ kube-system kube-scheduler-ramius-controller-0 1/1 Running 0
|
||||
|
||||
Learn about [maintenance](/topics/maintenance/) and [addons](/addons/overview/).
|
||||
|
||||
!!! note
|
||||
On Container Linux clusters, install the `CLUO` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
|
||||
|
||||
## Variables
|
||||
|
||||
Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/azure/container-linux/kubernetes/variables.tf) source.
|
||||
@ -218,14 +224,13 @@ Reference the DNS zone with `azurerm_dns_zone.clusters.name` and its resource gr
|
||||
|
||||
| Name | Description | Default | Example |
|
||||
|:-----|:------------|:--------|:--------|
|
||||
| asset_dir | Absolute path to a directory where generated assets should be placed (contains secrets) | "" (disabled) | "/home/user/.secrets/clusters/ramius" |
|
||||
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
|
||||
| worker_count | Number of workers | 1 | 3 |
|
||||
| controller_type | Machine type for controllers | "Standard_B2s" | See below |
|
||||
| worker_type | Machine type for workers | "Standard_DS1_v2" | See below |
|
||||
| os_image | Channel for a Container Linux derivative | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha |
|
||||
| os_image | Channel for a Container Linux derivative | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta |
|
||||
| disk_size | Size of the disk in GB | 40 | 100 |
|
||||
| worker_priority | Set priority to Low to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time | Regular | Low |
|
||||
| worker_priority | Set priority to Spot to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time | Regular | Spot |
|
||||
| controller_clc_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/#usage) |
|
||||
| worker_clc_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/#usage) |
|
||||
| networking | Choice of networking provider | "calico" | "flannel" or "calico" |
|
||||
@ -242,6 +247,6 @@ Check the list of valid [machine types](https://azure.microsoft.com/en-us/pricin
|
||||
!!! warning
|
||||
Do not choose a `controller_type` smaller than `Standard_B2s`. Smaller instances are not sufficient for running a controller.
|
||||
|
||||
#### Low Priority
|
||||
#### Spot Priority
|
||||
|
||||
Add `worker_priority=Low` to use [Low Priority](https://docs.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-use-low-priority) workers that run on Azure's surplus capacity at lower cost, but with the tradeoff that they can be deallocated at random. Low priority VMs are Azure's analog to AWS spot instances or GCP premptible instances.
|
||||
Add `worker_priority=Spot` to use [Spot Priority](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/spot-vms) workers that run on Azure's surplus capacity at lower cost, but with the tradeoff that they can be deallocated at random. Spot priority VMs are Azure's analog to AWS spot instances or GCP premptible instances.
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Bare-Metal
|
||||
|
||||
In this tutorial, we'll network boot and provision a Kubernetes v1.17.0 cluster on bare-metal with Container Linux.
|
||||
In this tutorial, we'll network boot and provision a Kubernetes v1.18.0 cluster on bare-metal with CoreOS Container Linux or Flatcar Linux.
|
||||
|
||||
First, we'll deploy a [Matchbox](https://github.com/poseidon/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Container Linux to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers via Ignition.
|
||||
|
||||
@ -27,7 +27,7 @@ Configure each machine to boot from the disk through IPMI or the BIOS menu.
|
||||
```
|
||||
ipmitool -H node1 -U USER -P PASS chassis bootdev disk options=persistent
|
||||
```
|
||||
|
||||
|
||||
During provisioning, you'll explicitly set the boot device to `pxe` for the next boot only. Machines will install (overwrite) the operating system to disk on PXE boot and reboot into the disk install.
|
||||
|
||||
!!! tip ""
|
||||
@ -111,7 +111,7 @@ Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your sy
|
||||
|
||||
```sh
|
||||
$ terraform version
|
||||
Terraform v0.12.16
|
||||
Terraform v0.12.21
|
||||
```
|
||||
|
||||
Add the [terraform-provider-matchbox](https://github.com/poseidon/terraform-provider-matchbox) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
@ -125,9 +125,9 @@ mv terraform-provider-matchbox-v0.3.0-linux-amd64/terraform-provider-matchbox ~/
|
||||
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.4.0/terraform-provider-ct-v0.4.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.4.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.4.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.4.0
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0
|
||||
```
|
||||
|
||||
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
|
||||
@ -150,7 +150,7 @@ provider "matchbox" {
|
||||
}
|
||||
|
||||
provider "ct" {
|
||||
version = "0.4.0"
|
||||
version = "0.5.0"
|
||||
}
|
||||
```
|
||||
|
||||
@ -160,8 +160,8 @@ Define a Kubernetes cluster using the module `bare-metal/container-linux/kuberne
|
||||
|
||||
```tf
|
||||
module "mercury" {
|
||||
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.17.0"
|
||||
|
||||
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.18.0"
|
||||
|
||||
# bare-metal
|
||||
cluster_name = "mercury"
|
||||
matchbox_http_endpoint = "http://matchbox.example.com"
|
||||
@ -264,9 +264,9 @@ Apply complete! Resources: 55 added, 0 changed, 0 destroyed.
|
||||
To watch the install to disk (until machines reboot from disk), SSH to port 2222.
|
||||
|
||||
```
|
||||
# before v1.17.0
|
||||
# before v1.10.1
|
||||
$ ssh debug@node1.example.com
|
||||
# after v1.17.0
|
||||
# after v1.10.1
|
||||
$ ssh -p 2222 core@node1.example.com
|
||||
```
|
||||
|
||||
@ -299,9 +299,9 @@ List nodes in the cluster.
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/mercury-config
|
||||
$ kubectl get nodes
|
||||
NAME STATUS ROLES AGE VERSION
|
||||
node1.example.com Ready <none> 10m v1.17.0
|
||||
node2.example.com Ready <none> 10m v1.17.0
|
||||
node3.example.com Ready <none> 10m v1.17.0
|
||||
node1.example.com Ready <none> 10m v1.18.0
|
||||
node2.example.com Ready <none> 10m v1.18.0
|
||||
node3.example.com Ready <none> 10m v1.18.0
|
||||
```
|
||||
|
||||
List the pods.
|
||||
@ -326,9 +326,6 @@ kube-system kube-scheduler-node1.example.com 1/1 Running 0
|
||||
|
||||
Learn about [maintenance](/topics/maintenance/) and [addons](/addons/overview/).
|
||||
|
||||
!!! note
|
||||
On Container Linux clusters, install the `CLUO` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
|
||||
|
||||
## Variables
|
||||
|
||||
Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/bare-metal/container-linux/kubernetes/variables.tf) source.
|
||||
@ -350,15 +347,16 @@ Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/bare-me
|
||||
|
||||
| Name | Description | Default | Example |
|
||||
|:-----|:------------|:--------|:--------|
|
||||
| asset_dir | Absolute path to a directory where generated assets should be placed (contains secrets) | "" (disabled) | "/home/user/.secrets/clusters/mercury" |
|
||||
| download_protocol | Protocol iPXE uses to download the kernel and initrd. iPXE must be compiled with [crypto](https://ipxe.org/crypto) support for https. Unused if cached_install is true | "https" | "http" |
|
||||
| cached_install | PXE boot and install from the Matchbox `/assets` cache. Admin MUST have downloaded Container Linux or Flatcar images into the cache | false | true |
|
||||
| install_disk | Disk device where Container Linux should be installed | "/dev/sda" | "/dev/sdb" |
|
||||
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
|
||||
| network_mtu | CNI interface MTU (calico-only) | 1480 | - |
|
||||
| network_mtu | CNI interface MTU (calico-only) | 1480 | - |
|
||||
| clc_snippets | Map from machine names to lists of Container Linux Config snippets | {} | [example](/advanced/customization/#usage) |
|
||||
| network_ip_autodetection_method | Method to detect host IPv4 address (calico-only) | "first-found" | "can-reach=10.0.0.1" |
|
||||
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
|
||||
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
|
||||
| kernel_args | Additional kernel args to provide at PXE boot | [] | ["kvm-intel.nested=1"] |
|
||||
| worker_node_labels | Map from worker name to list of initial node labels | {} | {"node2" = ["role=special"]} |
|
||||
| worker_node_taints | Map from worker name to list of initial node taints | {} | {"node2" = ["role=special:NoSchedule"]} |
|
||||
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Digital Ocean
|
||||
|
||||
In this tutorial, we'll create a Kubernetes v1.17.0 cluster on DigitalOcean with Container Linux.
|
||||
In this tutorial, we'll create a Kubernetes v1.18.0 cluster on DigitalOcean with CoreOS Container Linux or Flatcar Linux.
|
||||
|
||||
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create controller droplets, worker droplets, DNS records, tags, and TLS assets.
|
||||
|
||||
@ -18,15 +18,15 @@ Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your sy
|
||||
|
||||
```sh
|
||||
$ terraform version
|
||||
Terraform v0.12.16
|
||||
Terraform v0.12.21
|
||||
```
|
||||
|
||||
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.4.0/terraform-provider-ct-v0.4.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.4.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.4.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.4.0
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0
|
||||
```
|
||||
|
||||
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
|
||||
@ -50,12 +50,12 @@ Configure the DigitalOcean provider to use your token in a `providers.tf` file.
|
||||
|
||||
```tf
|
||||
provider "digitalocean" {
|
||||
version = "1.11.0"
|
||||
version = "1.14.0"
|
||||
token = "${chomp(file("~/.config/digital-ocean/token"))}"
|
||||
}
|
||||
|
||||
provider "ct" {
|
||||
version = "0.4.0"
|
||||
version = "0.5.0"
|
||||
}
|
||||
```
|
||||
|
||||
@ -65,16 +65,17 @@ Define a Kubernetes cluster using the module `digital-ocean/container-linux/kube
|
||||
|
||||
```tf
|
||||
module "nemo" {
|
||||
source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes?ref=v1.17.0"
|
||||
source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes?ref=v1.18.0"
|
||||
|
||||
# Digital Ocean
|
||||
cluster_name = "nemo"
|
||||
region = "nyc3"
|
||||
dns_zone = "digital-ocean.example.com"
|
||||
os_image = "coreos-stable"
|
||||
|
||||
# configuration
|
||||
ssh_fingerprints = ["d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7"]
|
||||
|
||||
|
||||
# optional
|
||||
worker_count = 2
|
||||
}
|
||||
@ -82,6 +83,28 @@ module "nemo" {
|
||||
|
||||
Reference the [variables docs](#variables) or the [variables.tf](https://github.com/poseidon/typhoon/blob/master/digital-ocean/container-linux/kubernetes/variables.tf) source.
|
||||
|
||||
### Flatcar Linux Only
|
||||
|
||||
!!! warning
|
||||
Typhoon for Flatcar Linux on DigitalOcean is alpha. Also IPv6 is unsupported with DigitalOcean custom images.
|
||||
|
||||
Flatcar Linux publishes DigitalOcean images, but does not upload them. DigitalOcean allows [custom boot images](https://blog.digitalocean.com/custom-images/) by file or URL.
|
||||
|
||||
[Download](https://www.flatcar-linux.org/releases/) the Flatcar Linux DigitalOcean bin image (or copy the URL) and [upload](https://cloud.digitalocean.com/images/custom_images) it as a custom image. Rename the image with the channel and version to refer to these images over time.
|
||||
|
||||
```tf
|
||||
module "nemo" {
|
||||
...
|
||||
os_image = data.digitalocean_image.flatcar-stable.id
|
||||
}
|
||||
|
||||
data "digitalocean_image" "flatcar-stable" {
|
||||
name = "flatcar-stable-2303.4.0.bin.bz2"
|
||||
}
|
||||
```
|
||||
|
||||
Set the [os_image](#variables) to the custom image id.
|
||||
|
||||
## ssh-agent
|
||||
|
||||
Initial bootstrapping requires `bootstrap.service` be started on one controller node. Terraform uses `ssh-agent` to automate this step. Add your SSH private key to `ssh-agent`.
|
||||
@ -138,9 +161,9 @@ List nodes in the cluster.
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/nemo-config
|
||||
$ kubectl get nodes
|
||||
NAME STATUS ROLES AGE VERSION
|
||||
10.132.110.130 Ready <none> 10m v1.17.0
|
||||
10.132.115.81 Ready <none> 10m v1.17.0
|
||||
10.132.124.107 Ready <none> 10m v1.17.0
|
||||
10.132.110.130 Ready <none> 10m v1.18.0
|
||||
10.132.115.81 Ready <none> 10m v1.18.0
|
||||
10.132.124.107 Ready <none> 10m v1.18.0
|
||||
```
|
||||
|
||||
List the pods.
|
||||
@ -149,9 +172,9 @@ List the pods.
|
||||
NAMESPACE NAME READY STATUS RESTARTS AGE
|
||||
kube-system coredns-1187388186-ld1j7 1/1 Running 0 11m
|
||||
kube-system coredns-1187388186-rdhf7 1/1 Running 0 11m
|
||||
kube-system calico-node-1m5bf 2/2 Running 0 11m
|
||||
kube-system calico-node-7jmr1 2/2 Running 0 11m
|
||||
kube-system calico-node-bknc8 2/2 Running 0 11m
|
||||
kube-system calico-node-1m5bf 2/2 Running 0 11m
|
||||
kube-system calico-node-7jmr1 2/2 Running 0 11m
|
||||
kube-system calico-node-bknc8 2/2 Running 0 11m
|
||||
kube-system kube-apiserver-ip-10.132.115.81 1/1 Running 0 11m
|
||||
kube-system kube-controller-manager-ip-10.132.115.81 1/1 Running 0 11m
|
||||
kube-system kube-proxy-6kxjf 1/1 Running 0 11m
|
||||
@ -164,9 +187,6 @@ kube-system kube-scheduler-ip-10.132.115.81 1/1 Running 0
|
||||
|
||||
Learn about [maintenance](/topics/maintenance/) and [addons](/addons/overview/).
|
||||
|
||||
!!! note
|
||||
On Container Linux clusters, install the `CLUO` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
|
||||
|
||||
## Variables
|
||||
|
||||
Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/digital-ocean/container-linux/kubernetes/variables.tf) source.
|
||||
@ -219,12 +239,11 @@ Digital Ocean requires the SSH public key be uploaded to your account, so you ma
|
||||
|
||||
| Name | Description | Default | Example |
|
||||
|:-----|:------------|:--------|:--------|
|
||||
| asset_dir | Absolute path to a directory where generated assets should be placed (contains secrets) | "" (disabled) | "/home/user/.secrets/nemo" |
|
||||
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
|
||||
| worker_count | Number of workers | 1 | 3 |
|
||||
| controller_type | Droplet type for controllers | "s-2vcpu-2gb" | s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb, ... |
|
||||
| worker_type | Droplet type for workers | "s-1vcpu-2gb" | s-1vcpu-2gb, s-2vcpu-2gb, ... |
|
||||
| image | Container Linux image for instances | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha |
|
||||
| os_image | Container Linux image for instances | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha, "custom-image-id" |
|
||||
| controller_clc_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/) |
|
||||
| worker_clc_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/) |
|
||||
| networking | Choice of networking provider | "calico" | "flannel" or "calico" |
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Google Cloud
|
||||
|
||||
In this tutorial, we'll create a Kubernetes v1.17.0 cluster on Google Compute Engine with Container Linux.
|
||||
In this tutorial, we'll create a Kubernetes v1.18.0 cluster on Google Compute Engine with CoreOS Container Linux or Flatcar Linux.
|
||||
|
||||
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a network, firewall rules, health checks, controller instances, worker managed instance group, load balancers, and TLS assets.
|
||||
|
||||
@ -18,15 +18,15 @@ Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your sy
|
||||
|
||||
```sh
|
||||
$ terraform version
|
||||
Terraform v0.12.16
|
||||
Terraform v0.12.21
|
||||
```
|
||||
|
||||
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.4.0/terraform-provider-ct-v0.4.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.4.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.4.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.4.0
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0
|
||||
```
|
||||
|
||||
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
|
||||
@ -49,14 +49,14 @@ Configure the Google Cloud provider to use your service account key, project-id,
|
||||
|
||||
```tf
|
||||
provider "google" {
|
||||
version = "2.20.0"
|
||||
version = "3.12.0"
|
||||
project = "project-id"
|
||||
region = "us-central1"
|
||||
credentials = file("~/.config/google-cloud/terraform.json")
|
||||
}
|
||||
|
||||
provider "ct" {
|
||||
version = "0.4.0"
|
||||
version = "0.5.0"
|
||||
}
|
||||
```
|
||||
|
||||
@ -71,7 +71,7 @@ Define a Kubernetes cluster using the module `google-cloud/container-linux/kuber
|
||||
|
||||
```tf
|
||||
module "yavin" {
|
||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.17.0"
|
||||
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.18.0"
|
||||
|
||||
# Google Cloud
|
||||
cluster_name = "yavin"
|
||||
@ -81,7 +81,7 @@ module "yavin" {
|
||||
|
||||
# configuration
|
||||
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
|
||||
|
||||
|
||||
# optional
|
||||
worker_count = 2
|
||||
}
|
||||
@ -89,6 +89,28 @@ module "yavin" {
|
||||
|
||||
Reference the [variables docs](#variables) or the [variables.tf](https://github.com/poseidon/typhoon/blob/master/google-cloud/container-linux/kubernetes/variables.tf) source.
|
||||
|
||||
### Flatcar Linux Only
|
||||
|
||||
!!! warning
|
||||
Typhoon for Flatcar Linux on Google Cloud is alpha.
|
||||
|
||||
Flatcar Linux publishes Google Cloud images, but does not upload them. Google Cloud allows [custom boot images](https://cloud.google.com/compute/docs/images/import-existing-image) to be uploaded to a bucket and imported into a project.
|
||||
|
||||
[Download](https://www.flatcar-linux.org/releases/) the Flatcar Linux GCE gzipped tarball and upload it to a Google Cloud storage bucket.
|
||||
|
||||
```
|
||||
gsutil list
|
||||
gsutil cp flatcar_production_gce.tar.gz gs://BUCKET
|
||||
```
|
||||
|
||||
Create a Compute Engine image from the file.
|
||||
|
||||
```
|
||||
gcloud compute images create flatcar-linux-2303-4-0 --source-uri gs://BUCKET_NAME/flatcar_production_gce.tar.gz
|
||||
```
|
||||
|
||||
Set the [os_image](#variables) to the image name (e.g. `flatcar-linux-2303-4-0`)
|
||||
|
||||
## ssh-agent
|
||||
|
||||
Initial bootstrapping requires `bootstrap.service` be started on one controller node. Terraform uses `ssh-agent` to automate this step. Add your SSH private key to `ssh-agent`.
|
||||
@ -145,9 +167,9 @@ List nodes in the cluster.
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/yavin-config
|
||||
$ kubectl get nodes
|
||||
NAME ROLES STATUS AGE VERSION
|
||||
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.17.0
|
||||
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.17.0
|
||||
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.17.0
|
||||
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.18.0
|
||||
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.18.0
|
||||
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.18.0
|
||||
```
|
||||
|
||||
List the pods.
|
||||
@ -172,9 +194,6 @@ kube-system kube-scheduler-controller-0 1/1 Running 0
|
||||
|
||||
Learn about [maintenance](/topics/maintenance/) and [addons](/addons/overview/).
|
||||
|
||||
!!! note
|
||||
On Container Linux clusters, install the `CLUO` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
|
||||
|
||||
## Variables
|
||||
|
||||
Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/google-cloud/container-linux/kubernetes/variables.tf) source.
|
||||
@ -212,12 +231,11 @@ resource "google_dns_managed_zone" "zone-for-clusters" {
|
||||
|
||||
| Name | Description | Default | Example |
|
||||
|:-----|:------------|:--------|:--------|
|
||||
| asset_dir | Absolute path to a directory where generated assets should be placed (contains secrets) | "" (disabled) | "/home/user/.secrets/clusters/yavin" |
|
||||
| controller_count | Number of controllers (i.e. masters) | 1 | 3 |
|
||||
| worker_count | Number of workers | 1 | 3 |
|
||||
| controller_type | Machine type for controllers | "n1-standard-1" | See below |
|
||||
| worker_type | Machine type for workers | "n1-standard-1" | See below |
|
||||
| os_image | Container Linux image for compute instances | "coreos-stable" | "coreos-stable-1632-3-0-v20180215" |
|
||||
| os_image | Container Linux image for compute instances | "coreos-stable" | "flatcar-linux-2303-4-0" |
|
||||
| disk_size | Size of the disk in GB | 40 | 100 |
|
||||
| worker_preemptible | If enabled, Compute Engine will terminate workers randomly within 24 hours | false | true |
|
||||
| controller_clc_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/) |
|
||||
|
@ -1,9 +1,6 @@
|
||||
# AWS
|
||||
|
||||
!!! danger
|
||||
Typhoon for Fedora CoreOS is an early preview! Fedora CoreOS itself is a preview! Expect bugs and design shifts. Please help both projects solve problems. Report Fedora CoreOS bugs to [Fedora](https://github.com/coreos/fedora-coreos-tracker/issues). Report Typhoon issues to Typhoon.
|
||||
|
||||
In this tutorial, we'll create a Kubernetes v1.17.0 cluster on AWS with Fedora CoreOS.
|
||||
In this tutorial, we'll create a Kubernetes v1.18.0 cluster on AWS with Fedora CoreOS.
|
||||
|
||||
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets.
|
||||
|
||||
@ -21,15 +18,15 @@ Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your sy
|
||||
|
||||
```sh
|
||||
$ terraform version
|
||||
Terraform v0.12.16
|
||||
Terraform v0.12.21
|
||||
```
|
||||
|
||||
Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name.
|
||||
|
||||
```sh
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.4.0/terraform-provider-ct-v0.4.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.4.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.4.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.4.0
|
||||
wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz
|
||||
mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0
|
||||
```
|
||||
|
||||
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
|
||||
@ -52,13 +49,13 @@ Configure the AWS provider to use your access key credentials in a `providers.tf
|
||||
|
||||
```tf
|
||||
provider "aws" {
|
||||
version = "2.41.0"
|
||||
version = "2.53.0"
|
||||
region = "eu-central-1"
|
||||
shared_credentials_file = "/home/user/.config/aws/credentials"
|
||||
}
|
||||
|
||||
provider "ct" {
|
||||
version = "0.4.0"
|
||||
version = "0.5.0"
|
||||
}
|
||||
```
|
||||
|
||||
@ -73,7 +70,7 @@ Define a Kubernetes cluster using the module `aws/fedora-coreos/kubernetes`.
|
||||
|
||||
```tf
|
||||
module "tempest" {
|
||||
source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.17.0"
|
||||
source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.18.0"
|
||||
|
||||
# AWS
|
||||
cluster_name = "tempest"
|
||||
@ -146,9 +143,9 @@ List nodes in the cluster.
|
||||
$ export KUBECONFIG=/home/user/.kube/configs/tempest-config
|
||||
$ kubectl get nodes
|
||||
NAME STATUS ROLES AGE VERSION
|
||||
ip-10-0-3-155 Ready <none> 10m v1.17.0
|
||||
ip-10-0-26-65 Ready <none> 10m v1.17.0
|
||||
ip-10-0-41-21 Ready <none> 10m v1.17.0
|
||||
ip-10-0-3-155 Ready <none> 10m v1.18.0
|
||||
ip-10-0-26-65 Ready <none> 10m v1.18.0
|
||||
ip-10-0-41-21 Ready <none> 10m v1.18.0
|
||||
```
|
||||
|
||||
List the pods.
|
||||
@ -207,7 +204,6 @@ Reference the DNS zone id with `aws_route53_zone.zone-for-clusters.zone_id`.
|
||||
|
||||
| Name | Description | Default | Example |
|
||||
|:-----|:------------|:--------|:--------|
|
||||
| asset_dir | Absolute path to a directory where generated assets should be placed (contains secrets) | "" (disabled) | "/home/user/.secrets/clusters/tempest" |
|
||||
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
|
||||
| worker_count | Number of workers | 1 | 3 |
|
||||
| controller_type | EC2 instance type for controllers | "t3.small" | See below |
|
||||
@ -218,8 +214,8 @@ Reference the DNS zone id with `aws_route53_zone.zone-for-clusters.zone_id`.
|
||||
| disk_iops | IOPS of the EBS volume | 0 (i.e. auto) | 400 |
|
||||
| worker_target_groups | Target group ARNs to which worker instances should be added | [] | [aws_lb_target_group.app.id] |
|
||||
| worker_price | Spot price in USD for worker instances or 0 to use on-demand instances | 0 | 0.10 |
|
||||
| controller_snippets | Controller Fedora CoreOS Config snippets | [] | UNSUPPORTED |
|
||||
| worker_clc_snippets | Worker Fedora CoreOS Config snippets | [] | UNSUPPORTED |
|
||||
| controller_snippets | Controller Fedora CoreOS Config snippets | [] | [examples](/advanced/customization/) |
|
||||
| worker_snippets | Worker Fedora CoreOS Config snippets | [] | [examples](/advanced/customization/) |
|
||||
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
|
||||
| network_mtu | CNI interface MTU (calico only) | 1480 | 8981 |
|
||||
| host_cidr | CIDR IPv4 range to assign to EC2 instances | "10.0.0.0/16" | "10.1.0.0/16" |
|
||||
|
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user