Compare commits

...

84 Commits

Author SHA1 Message Date
6df6bf904a Show Cilium as a CNI provider option in docs
* Start to show Cilium as a CNI option
* https://github.com/cilium/cilium
2020-07-18 13:27:56 -07:00
5fba20d358 Update recommended Terraform provider versions
* Sync Terraform provider plugin versions with those
used internally
2020-07-18 13:19:25 -07:00
a8d3d3bb12 Update ingress-nginx from v0.33.0 to v0.34.1
* Switch to ingress-nginx controller images from us.gcr.io (eu, asia
mirrors can also be used if desired)
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.34.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.34.0
2020-07-15 22:43:49 -07:00
9ea6d2c245 Update Kubernetes from v1.18.5 to v1.18.6
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1186
* https://github.com/poseidon/terraform-render-bootstrap/pull/201
2020-07-15 22:05:57 -07:00
507aac9b78 Update mkdocs-material from v5.3.3 to v5.4.0 2020-07-11 22:56:59 -07:00
dfd2a0ec23 Update Grafana from v7.0.5 to v7.0.6
* https://github.com/grafana/grafana/releases/tag/v7.0.6
2020-07-09 21:10:48 -07:00
e3bf7d8f9b Update Prometheus from v2.19.1 to v2.19.2
* https://github.com/prometheus/prometheus/releases/tag/v2.19.2
2020-07-09 21:08:55 -07:00
49050320ce Update Cilium from v1.8.0 to v1.8.1
* https://github.com/cilium/cilium/releases/tag/v1.8.1
2020-07-05 16:00:00 -07:00
74e025c9e4 Update Grafana from v7.0.4 to v7.0.5
* https://github.com/grafana/grafana/releases/tag/v7.0.5
2020-07-05 15:49:34 -07:00
257a49ce37 Remove CoreOS Container Linux image names from docs
* Remove coreos-stable, coreos-beta, and coreos-alpha channel
references from docs
* CoreOS Container Linux is end of life (see changelog)
2020-06-30 01:36:53 -07:00
df3f40bcce Allow using Flatcar Linux edge on Azure
* Set Kubelet cgroup driver to systemd when Flatcar Linux edge
is chosen

Note: Typhoon module status assumes use of the stable variant of
an OS channel/stream. It's possible to use earlier variants, and
those are sometimes tested or developed against, but stable is
the recommendation. A sketch of selecting the edge image follows.
2020-06-30 01:35:29 -07:00
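
For illustration, a minimal sketch of opting into the edge image (module name and ref are placeholders; `os_image = "flatcar-edge"` and the cgroup driver behavior come from the commit above):

```tf
module "ramius" {
  source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes?ref=v1.18.5"

  # ...other required variables elided...

  # select the Flatcar Linux edge image; the module sets the
  # Kubelet cgroup driver to systemd when this image is chosen
  os_image = "flatcar-edge"
}
```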
32886cfba1 Promote Fedora CoreOS on Google Cloud to stable status 2020-06-29 23:09:11 -07:00
0ba2c1a4da Fix terraform fmt in firewall rules 2020-06-29 23:04:54 -07:00
430d139a5b Remove os_image variable on Google Cloud Fedora CoreOS
* In v1.18.3, the `os_stream` variable was added to select
a Fedora CoreOS image stream (stable, testing, next) on
AWS and Google Cloud (which publish official streams)
* Remove `os_image` variable deprecated in v1.18.3. Manually
uploaded images are no longer needed (see the sketch below)
2020-06-29 22:57:11 -07:00
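
A minimal sketch of stream selection on the Google Cloud Fedora CoreOS module (module name and ref are placeholders; the `os_stream` values are those listed above):

```tf
module "yavin" {
  source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.18.5"

  # ...other required variables elided...

  # Fedora CoreOS image stream: "stable" (default), "testing", or "next"
  os_stream = "testing"
}
```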
7c6ab21b94 Isolate each DigitalOcean cluster in its own VPC
* DigitalOcean introduced Virtual Private Cloud (VPC) support
to match other clouds and enhance the prior "private networking"
feature. Before, droplets belonging to different clusters (but
residing in the same region) could reach one another (although
Typhoon firewall rules prohibit this). Now, droplets in a VPC
reside in their own network
* https://www.digitalocean.com/docs/networking/vpc/
* Create droplet instances in a VPC per cluster. This matches the
design of Typhoon AWS, Azure, and GCP.
* Require `terraform-provider-digitalocean` v1.16.0+ (action required)
* Output `vpc_id` for use with an attached DigitalOcean
loadbalancer (see the sketch below)
2020-06-28 23:25:30 -07:00
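
As a hedged sketch of the new output's intended use, assuming the DigitalOcean provider's `digitalocean_loadbalancer` resource and its `vpc_uuid` argument (cluster module name and load balancer settings are illustrative):

```tf
resource "digitalocean_loadbalancer" "ingress" {
  name   = "ingress"
  region = "nyc3"

  # attach the load balancer to the cluster's own VPC
  vpc_uuid = module.nemo.vpc_id

  forwarding_rule {
    entry_protocol  = "tcp"
    entry_port      = 443
    target_protocol = "tcp"
    target_port     = 443
  }
}
```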
21178868db Revert "Update Prometheus from v2.19.1 to v2.19.2"
* Prometheus has not published the v2.19.2 image yet
* This reverts commit 81b6f54169.
2020-06-27 14:53:58 -07:00
9dcf35e393 Update recommended Terraform provider versions
* Sync Terraform provider plugin versions with those
used internally
2020-06-27 14:44:18 -07:00
81b6f54169 Update Prometheus from v2.19.1 to v2.19.2
* https://github.com/prometheus/prometheus/releases/tag/v2.19.2
2020-06-27 14:34:30 -07:00
7bce15975c Update Kubernetes from v1.18.4 to v1.18.5
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1185
2020-06-27 13:52:18 -07:00
1f83ae7dbb Update Calico from v3.14.1 to v3.15.0
* https://docs.projectcalico.org/v3.15/release-notes/
2020-06-26 02:40:12 -07:00
a10a1cee9f Update mkdocs-material from v5.3.0 to v5.3.3 2020-06-26 02:24:37 -07:00
a79ad34ba3 Update Grafana from v7.0.3 to v7.0.4
* https://github.com/grafana/grafana/releases/tag/v7.0.4
2020-06-26 02:06:38 -07:00
99a11442c7 Update Prometheus from v2.19.0 to v2.19.1
* https://github.com/prometheus/prometheus/releases/tag/v2.19.1
2020-06-26 02:01:58 -07:00
d27f367004 Update Cilium from v1.8.0-rc4 to v1.8.0
* https://github.com/cilium/cilium/releases/tag/v1.8.0
2020-06-22 22:26:49 -07:00
e9c8520359 Add experimental Cilium CNI provider
* Accept experimental CNI `networking` mode "cilium" (see the
sketch after this entry)
* Run Cilium v1.8.0-rc4 with overlay vxlan tunnels and a
minimal set of features. We're interested in:
  * IPAM: Divide pod_cidr into /24 subnets per node
  * CNI networking pod-to-pod, pod-to-external
  * BPF masquerade
  * NetworkPolicy as defined by Kubernetes (no L7 Policy)
* Continue using kube-proxy with Cilium probe mode
* Firewall changes:
  * Require UDP 8472 for vxlan (Linux kernel default) between nodes
  * Optional ICMP echo(8) between nodes for host reachability
    (health)
  * Optional TCP 4240 between nodes for endpoint reachability (health)

Known Issues:

* Containers with `hostPort` don't listen on all host addresses;
these workloads must use `hostNetwork` for now
https://github.com/cilium/cilium/issues/12116
* Erroneous warning on Fedora CoreOS
https://github.com/cilium/cilium/issues/10256

Note: This is experimental. It is not listed in docs and may be
changed or removed without a deprecation notice

Related:

* https://github.com/poseidon/terraform-render-bootstrap/pull/192
* https://github.com/cilium/cilium/issues/12217
2020-06-21 20:41:53 -07:00
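
For reference, a hedged sketch of opting in (module name, platform, and ref are placeholders; `networking = "cilium"` is the experimental mode described above and may change or be removed):

```tf
module "tempest" {
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.18.5"

  # ...other required variables elided...

  # experimental: Cilium CNI instead of the default provider
  networking = "cilium"
}
```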
37f00a3882 Reduce Calico MTU on Fedora CoreOS Azure
* Change the Calico VXLAN interface MTU from 1450 to 1410
* VXLAN on Azure should support MTU 1450. However, past
performance measurements have shown that 1410 is needed to
achieve expected performance. Flatcar Linux has the same
MTU 1410 override and note
* FCOS 31.20200323.3.2 was known to perform fine with 1450, but
in 31.20200517.3.0 the right value seems to be 1410
2020-06-19 00:24:56 -07:00
4cfafeaa07 Fix Kubelet starting before hostname set on FCOS AWS
* Fedora CoreOS `kubelet.service` can start before the hostname
is set. Kubelet reads the hostname to determine the node name to
register. If the hostname was read as localhost, Kubelet will
continue trying to register as localhost (problem)
* This race manifests as a node that appears NotReady: the Kubelet
keeps trying to register as localhost, while the host itself (by
then) has an AWS-provided hostname. Restarting kubelet.service is a
manual fix so Kubelet re-reads the hostname
* This race could only be reproduced on AWS, not on Google Cloud or
Azure despite attempts. Bare-metal and DigitalOcean differ and
use hostname-override (e.g. afterburn) so they're not affected
* Wait for nodes to have a non-localhost hostname in the oneshot
that awaits /etc/resolv.conf. Typhoon has no valid cases for a
node hostname being localhost (not even single-node clusters)

Related OpenShift: https://github.com/openshift/machine-config-operator/pull/1813
Close https://github.com/poseidon/typhoon/issues/765
2020-06-19 00:19:54 -07:00
90e23f5822 Rename controller node label and NoSchedule taint
* Remove node label `node.kubernetes.io/master` from controller nodes
* Use `node.kubernetes.io/controller` (present since v1.9.5,
[#160](https://github.com/poseidon/typhoon/pull/160)) to node select controllers
* Rename controller NoSchedule taint from `node-role.kubernetes.io/master` to
`node-role.kubernetes.io/controller`
* Tolerate the new taint name for workloads that may run on controller nodes
and stop tolerating `node-role.kubernetes.io/master` taint
2020-06-19 00:12:13 -07:00
6234147948 Update recommended Terraform provider versions
* Sync Terraform provider plugin versions with those
used internally
2020-06-18 01:03:37 -07:00
c25c59058c Update Kubernetes from v1.18.3 to v1.18.4
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1184
2020-06-17 19:53:19 -07:00
bc9b808d44 Update nginx-ingress from v0.32.0 to v0.33.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-0.33.0
2020-06-16 18:44:40 -07:00
4b0203fdb2 Fix typo in DigitalOcean docs title 2020-06-16 18:33:56 -07:00
331566e1f7 Update mkdocs packages for website 2020-06-16 18:20:19 -07:00
04520e447c Update node-exporter from v1.0.0 to v1.0.1
* https://github.com/prometheus/node_exporter/releases/tag/v1.0.1
2020-06-16 17:57:09 -07:00
413585681b Remove unused Kubelet lock-file and exit-on-lock-contention
* Kubelet `--lock-file` and `--exit-on-lock-contention` date
back to usage of bootkube and at one point running Kubelet
in a "self-hosted" style whereby an on-host Kubelet (rkt)
started pods, but then a Kubelet DaemonSet was scheduled
and able to take over (hence self-hosted). `lock-file` and
`exit-on-lock-contention` flags supported this pivot. The
pattern has been out of favor (in bootkube too) for years
because of dueling Kubelet complexity
* Typhoon runs Kubelet as a container via an on-host systemd
unit using podman (Fedora CoreOS) or rkt (Flatcar Linux). In
fact, Typhoon no longer uses bootkube or control plane pivot
(let alone Kubelet pivot) and uses static pods since v1.16.0
* https://github.com/poseidon/typhoon/pull/536
2020-06-12 00:06:41 -07:00
96711d7f17 Remove unused Kubelet cert / key Terraform state
* Generated Kubelet TLS certificate and key are no longer
used or distributed to machines since Kubelet TLS bootstrap
is used instead. Remove the certificate and key from state
2020-06-11 21:24:36 -07:00
c9059d3fe9 Update Prometheus from v2.19.0-rc.0 to v2.19.0
* https://github.com/prometheus/prometheus/releases/tag/v2.19.0
2020-06-09 23:05:03 -07:00
a287920169 Use strict mode for Container Linux Configs
* Enable terraform-provider-ct `strict` mode for parsing
Container Linux Configs and snippets
* Fix Container Linux Config systemd unit syntax `enable`
(old) to `enabled`
* Align with Fedora CoreOS which uses strict mode already
(see the snippet sketch below)
2020-06-09 23:00:36 -07:00
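
A sketch of a snippet that parses under strict mode (unit name and contents are illustrative; only the `enabled` key fix comes from the commit above):

```tf
module "cluster" {
  # ...other required variables elided...

  # snippets are now parsed strictly, so the old `enable` key is rejected
  controller_snippets = [<<-EOT
    systemd:
      units:
        - name: hello.service
          enabled: true   # formerly `enable`, invalid under strict parsing
          contents: |
            [Unit]
            Description=Hello World
            [Service]
            ExecStart=/usr/bin/echo hello
            [Install]
            WantedBy=multi-user.target
  EOT
  ]
}
```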
8dc170b9d9 Update security disclosure contact email
* Use security@psdn.io across github.com/poseidon projects
2020-06-08 12:37:09 -07:00
aed1a5f33d Fix Fedora CoreOS docs for selecting a stream
* Fedora CoreOS image `os_stream` stable, testing, and next
have been configurable since v1.18.3
* Remove mention of outdated `os_image` variable
2020-06-08 12:25:57 -07:00
31d02b0221 Update Prometheus from v2.18.1 to v2.19.0-rc.0
* https://github.com/prometheus/prometheus/releases/tag/v2.19.0-rc.0
2020-06-05 00:16:45 -07:00
8f875f80f5 Update Grafana from v7.0.1 to v7.0.3
* https://github.com/grafana/grafana/releases/tag/v7.0.2
* https://github.com/grafana/grafana/releases/tag/v7.0.3
2020-06-03 12:31:58 -07:00
16c0b9152b Update kube-state-metrics from v1.9.6 to v1.9.7
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.7
2020-06-03 11:35:10 -07:00
99dbce67a3 Tweak minor style elements of issue templates 2020-05-31 16:19:33 -07:00
20bfd69780 Change Kubelet container image publishing
* Build Kubelet container images internally and publish
to Quay and Dockerhub (new) as an alternative in case of
registry outage or breach
* Use our infra to provide single and multi-arch (default)
Kubelet images for possible future use
* Docs: Show how to use alternative Kubelet images via
snippets and a systemd dropin (builds on #737)

Changes:

* Update docs with changes to Kubelet image building
* If you prefer to trust images built by Quay/Dockerhub,
automated image builds are still available with unique
tags (albeit with some limitations):
  * Quay automated builds are tagged `build-{short_sha}`
  (limit: only amd64)
  * Dockerhub automated builds are tagged `build-{tag}`
  and `build-master` (limit: only amd64, no shas)

Links:

* Kubelet: https://github.com/poseidon/kubelet
* Docs: https://typhoon.psdn.io/topics/security/#container-images
* Registries:
  * quay.io/poseidon/kubelet
  * docker.io/psdn/kubelet
2020-05-30 23:34:23 -07:00
ba44408b76 Update Calico from v3.14.0 to v3.14.1
* https://docs.projectcalico.org/v3.14/release-notes/
2020-05-30 22:08:37 -07:00
455175d9e6 Update the fallback issue template
* Even "blank" issues need to fill out the fallback
template
2020-05-28 00:06:59 -07:00
d45804b1f6 Update Github issue template to use drop-downs (#747)
* Create a stricter bug report template
* Highlight topics that are not accepted in issues: operation, support, debugging, advice, or Kubernetes concepts
* Add a section to strongly suggest bug reports link a PR or describe a solution. This may be able to weed out topics that aren't focused bug reports
2020-05-27 23:49:25 -07:00
907a96916f Update mkdocs-material from v5.2.0 to v5.2.2
* https://github.com/squidfunk/mkdocs-material/releases/tag/5.2.2
2020-05-27 21:49:40 -07:00
187bb17d39 Update Grafana from v7.0.0 to v7.0.1
* https://github.com/grafana/grafana/releases/tag/v7.0.1
2020-05-27 21:35:24 -07:00
abc31c3711 Update node-exporter from v1.0.0-rc.1 to v1.0.0
* https://github.com/prometheus/node_exporter/releases/tag/v1.0.0
2020-05-27 21:33:03 -07:00
283e14f3e0 Update recommended Terraform provider versions
* Sync Terraform provider plugin versions to those actively
used internally
* Fix terraform fmt
2020-05-22 01:12:53 -07:00
e72f916c8d Update etcd from v3.4.8 to v3.4.9
* https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.4.md#v349-2020-05-20
2020-05-22 00:52:20 -07:00
c52f9f8d08 Upgrade docs packages and refresh content
* Promote DigitalOcean from alpha to beta for Fedora
CoreOS and Flatcar Linux
* Upgrade mkdocs-material and PyPI packages for docs
* Replace docs mentions of Container Linux with Flatcar
Linux and move docs/cl to docs/flatcar-linux
* Deprecate CoreOS Container Linux support. Its still
usable for some time, but start removing docs
2020-05-20 23:31:26 -07:00
ecae6679ff Update Kubernetes from v1.18.2 to v1.18.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md
2020-05-20 20:37:39 -07:00
4760543356 Set Kubelet image via kubelet.service KUBELET_IMAGE
* Write the systemd kubelet.service to use `KUBELET_IMAGE`
as the Kubelet image. This provides a nice way to use systemd
dropins to temporarily override the image (e.g. during a
registry outage), as sketched below

Note: Only Typhoon Kubelet images and registries are supported.
2020-05-19 22:39:53 -07:00
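
A hedged sketch of such a temporary override via a Container Linux Config snippet (the dropin name is hypothetical; the Dockerhub fallback registry is the one documented in the image publishing commit above):

```tf
module "cluster" {
  # ...other required variables elided...

  controller_snippets = [<<-EOT
    systemd:
      units:
        - name: kubelet.service
          dropins:
            - name: 10-kubelet-image.conf
              contents: |
                [Service]
                # temporarily pull the Kubelet from the fallback registry
                Environment=KUBELET_IMAGE=docker://docker.io/psdn/kubelet:v1.18.6
  EOT
  ]
}
```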
09eb208b4e Fix Fedora CoreOS on GCP proposing controller recreate
* With Fedora CoreOS image stream support (#727), the latest
resolved image will change over the lifecycle of a cluster.
* Fix issue where an image diff proposed replacing a Fedora
CoreOS controller on GCP, introduced in #727 (unreleased)
* Also ignore image diffs to the GCP managed instance group
of workers. This aligns with worker AMI diffs being ignored
on AWS and similar on Azure, since workers update themselves.

Background:

* Controller nodes should strictly not be recreated by Terraform;
they are stateful (etcd) and should not be replaced
* Across cloud platforms, OS image diffs are ignored since both
Flatcar Linux and Fedora CoreOS nodes update themselves. For
workers, user-data or disk size diffs (where relevant) are allowed
to recreate worker templates/configs since these are considered
to be user-initiated declarations that a reprovision should be done
2020-05-19 21:41:51 -07:00
8d024d22ad Update etcd from v3.4.7 to v3.4.8
* https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.4.md#v348-2020-05-18
2020-05-18 23:50:46 -07:00
3bdddc452c Update Grafana from v7.0.0-beta3 to v7.0.0
* https://grafana.com/docs/grafana/latest/guides/whats-new-in-v7-0/
2020-05-18 23:42:32 -07:00
ff4187a1fb Update Azure subnets to set address_prefixes list
* Update Azure subnet `address_prefix` to the `address_prefixes` list
(sketch below)
* Fix warning that `address_prefix` is deprecated
* Require `terraform-provider-azurerm` v2.8.0+ (action required)

Rel: https://github.com/terraform-providers/terraform-provider-azurerm/pull/6493
2020-05-18 23:35:47 -07:00
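
For reference, a sketch of the argument change on an `azurerm_subnet` (names and CIDR are illustrative):

```tf
resource "azurerm_subnet" "worker" {
  name                 = "worker"
  resource_group_name  = azurerm_resource_group.cluster.name
  virtual_network_name = azurerm_virtual_network.network.name

  # formerly: address_prefix = "10.0.1.0/24" (deprecated)
  address_prefixes = ["10.0.1.0/24"]
}
```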
2578be1f96 Roll back Grafana to v7.0.0-beta3, v7.0.0 image is missing
* Grafana hasn't published the v7.0.0 image yet
2020-05-16 12:32:10 -07:00
90edcd3d77 Update node-exporter from v1.0.0-rc.0 to v1.0.0-rc.1
* https://github.com/prometheus/node_exporter/releases/tag/v1.0.0-rc.1
2020-05-15 18:03:19 -07:00
a927c7c790 Update kube-state-metrics from v1.9.5 to v1.9.6
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.6
2020-05-15 17:42:24 -07:00
d952576d2f Update Grafana from v7.0.0-beta3 to v7.0.0
* https://github.com/grafana/grafana/releases/tag/7.0.0
2020-05-15 17:38:59 -07:00
70e389f37f Restore use of Flatcar Linux Azure Marketplace image
* Switch Flatcar Linux Azure to use the Marketplace image
from Kinvolk (offer `flatcar-container-linux-free`)
* Accepting Azure Marketplace terms is still necessary;
update docs to show accepting the free offer rather than
BYOL

* Upstream Flatcar: https://github.com/flatcar-linux/Flatcar/issues/82
* Typhoon: https://github.com/poseidon/typhoon/issues/703
2020-05-13 22:50:24 -07:00
a18bd0a707 Highlight SELinux enforcing mode in features 2020-05-13 21:57:38 -07:00
01905b00bc Support Fedora CoreOS OS image streams on AWS
* Add `os_stream` variable to set the stream to stable (default),
testing, or next
* Remove unused os_image variable on Fedora CoreOS AWS
2020-05-13 21:45:12 -07:00
f4194cd57a Update Grafana from v7.0.0-beta2 to v7.0.0-beta3
* https://github.com/grafana/grafana/releases/tag/v7.0.0-beta3
2020-05-09 17:50:40 -07:00
a2db4fa8c4 Update Calico from v3.13.3 to v3.14.0
* https://docs.projectcalico.org/v3.14/release-notes/
2020-05-09 16:05:30 -07:00
358854e712 Fix Calico install-cni crash loop on Pod restarts
* Set a consistent MCS level/range for Calico install-cni
* Note: Rebooting a node was a workaround, because Kubelet
relabels /etc/kubernetes(/cni/net.d)

Background:

* On SELinux enforcing systems, the Calico CNI install-cni
container ran with the default SELinux context and a random MCS
pair. install-cni places CNI configs by first creating a
temporary file and then moving it into place, which means
the file's MCS categories depend on the container's SELinux
context
* calico-node Pod restarts create a new install-cni container
with a different MCS pair that cannot access the earlier
written file (it places configs every time), causing the
init container to error and calico-node to crash loop
* https://github.com/projectcalico/cni-plugin/issues/874

```
mv: inter-device move failed: '/calico.conf.tmp' to
'/host/etc/cni/net.d/10-calico.conflist'; unable to remove target:
Permission denied
Failed to mv files. This may be caused by selinux configuration on
the host, or something else.
```

Note: this isn't a host SELinux configuration issue.

Related:

* https://github.com/poseidon/terraform-render-bootstrap/pull/186
2020-05-09 16:01:44 -07:00
b5dabcea31 Use Fedora CoreOS image streams on Google Cloud
* Add `os_stream` variable to set a Fedora CoreOS stream
to `stable` (default), `testing`, or `next`
* Deprecate `os_image` variable. Remove docs about uploading
Fedora CoreOS images manually, this is no longer needed
* https://docs.fedoraproject.org/en-US/fedora-coreos/update-streams/

Rel: https://github.com/coreos/fedora-coreos-docs/pull/70
2020-05-08 01:23:12 -07:00
3f0a5d2715 Update Grafana from v7.0.0-beta1 to v7.0.0-beta2
* https://github.com/grafana/grafana/releases/tag/v7.0.0-beta2
2020-05-07 23:04:44 -07:00
33173c0206 Update Prometheus from v2.18.0 to v2.18.1
* https://github.com/prometheus/prometheus/releases/tag/v2.18.1
2020-05-07 22:59:11 -07:00
70f30d9c07 Update Prometheus from v2.18.0-rc.1 to v2.18.0
* https://github.com/prometheus/prometheus/releases/tag/v2.18.0
2020-05-05 22:31:11 -07:00
6afc1643d9 Update nginx-ingress from v0.30.0 to v0.32.0
* Add support for IngressClass and RBAC authorization
* Since our nginx ingress controller example uses the flag
`--ingress-class=public`, add an IngressClass to go along
with it

Rel: https://kubernetes.io/docs/concepts/services-networking/ingress/#ingress-class
2020-05-03 23:24:19 -07:00
e71e27e769 Update Prometheus from v2.17.2 to v2.18.0-rc.1
* https://github.com/prometheus/prometheus/releases/tag/v2.18.0-rc.1
2020-04-29 20:57:48 -07:00
64035005d4 Update Grafana from v6.7.2 to v7.0.0-beta1
* https://github.com/grafana/grafana/releases/tag/v7.0.0-beta1
2020-04-29 20:53:30 -07:00
317416b316 Use Terraform element wrap-around for AWS controllers subnet_id (#714)
* Fix Terraform plan error when controller_count exceeds available AWS zones (e.g. 5 controllers); see the wrap-around sketch below
2020-04-29 20:41:08 -07:00
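
The fix works because `element()` wraps indices modulo the list length, whereas the index syntax errors past the end of the list; a small sketch (zone values are illustrative):

```tf
locals {
  zones = ["us-west-2a", "us-west-2b", "us-west-2c"]
}

# element() wraps around: index 3 maps back to the first zone,
# while local.zones[3] would be an out-of-range error
output "controller_3_zone" {
  value = element(local.zones, 3) # "us-west-2a"
}
```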
2c1af917ec Update recommended Terraform provider versions
* Sync the Terraform provider plugin versions to those
actively used and tested by the author
* Fix terraform fmt
2020-04-28 19:57:50 -07:00
4ac2d94999 Add Fedora CoreOS Azure docs to site navigation
* Fix missing Fedora CoreOS Azure docs
2020-04-28 19:54:37 -07:00
fd044ee117 Enable Kubelet TLS bootstrap and NodeRestriction
* Enable bootstrap token authentication on kube-apiserver
* Generate the bootstrap.kubernetes.io/token Secret that
may be used as a bootstrap token
* Generate a bootstrap kubeconfig (with a bootstrap token)
to be securely distributed to nodes. Each Kubelet will use
the bootstrap kubeconfig to authenticate to kube-apiserver
as `system:bootstrappers` and send a node-unique CSR, which
kube-controller-manager automatically approves to issue
a Kubelet certificate and kubeconfig (expires in 72 hours)
* Add ClusterRoleBinding for bootstrap token subjects
(`system:bootstrappers`) to have the `system:node-bootstrapper`
ClusterRole
* Add ClusterRoleBinding for bootstrap token subjects
(`system:bootstrappers`) to have the csr nodeclient ClusterRole
* Add ClusterRoleBinding for bootstrap token subjects
(`system:bootstrappers`) to have the csr selfnodeclient ClusterRole
* Enable NodeRestriction admission controller to limit the
scope of Node or Pod objects a Kubelet can modify to those of
the node itself
* The ability for a Kubelet to delete its Node object is retained,
as preemptible nodes or those in auto-scaling instance groups
need to be able to remove themselves on shutdown. This need
continues to have precedence over any risk of a node deleting
itself maliciously

Security notes:

1. Issued Kubelet certificates authenticate as user `system:node:NAME`
and group `system:nodes` and are limited in their authorization
to perform API operations by Node authorization and NodeRestriction
admission. Previously, a Kubelet's authorization was broader. This
is the primary security motivation.

2. The bootstrap kubeconfig credential has the same sensitivity
as the previous generated TLS client-certificate kubeconfig.
It must be distributed securely to nodes. Its compromise still
allows an attacker to obtain a Kubelet kubeconfig

3. Bootstrapping Kubelet kubeconfigs with a limited lifetime offers
a slight security improvement.
  * An attacker who obtains the kubeconfig can likely obtain the
  bootstrap kubeconfig as well, to obtain the ability to renew
  their access
  * A compromised bootstrap kubeconfig could plausibly be handled
  by replacing the bootstrap token Secret, distributing the token
  to new nodes, and letting the old token expire, whereas a
  compromised TLS client-certificate kubeconfig can't be revoked
  (no CRL). However, replacing a bootstrap token can be impractical
  in real cluster environments, so the limited lifetime is mostly
  a theoretical benefit.
  * Cluster CSR objects are visible via kubectl, which is nice

4. Bootstrapping node-unique Kubelet kubeconfigs means Kubelet
clients have more identity information, which can improve the
utility of audits and future features

Rel: https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/
Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/185
2020-04-28 19:35:33 -07:00
38a6bddd06 Update Calico from v3.13.1 to v3.13.3
* https://docs.projectcalico.org/v3.13/release-notes/
2020-04-23 23:58:02 -07:00
d8966afdda Remove extraneous sudo from layout asset unpacking 2020-04-22 20:28:01 -07:00
84ed0a31c3 Update Prometheus from v2.17.1 to v2.17.2
* https://github.com/prometheus/prometheus/releases/tag/v2.17.2
2020-04-20 18:09:24 -07:00
136 changed files with 1682 additions and 958 deletions

View File

@ -1,33 +0,0 @@
<!-- Fill in either the 'Bug' or 'Feature Request' section -->
## Bug
### Environment
* Platform: aws, azure, bare-metal, google-cloud, digital-ocean
* OS: fedora-coreos, flatcar-linux
* Release: Typhoon version or Git SHA (reporting latest is **not** helpful)
* Terraform: `terraform version` (reporting latest is **not** helpful)
* Plugins: Provider plugin versions (reporting latest is **not** helpful)
### Problem
Describe the problem.
### Desired Behavior
Describe the goal.
### Steps to Reproduce
Provide clear steps to reproduce the issue unless already covered.
## Feature Request
### Feature
Describe the feature and what problem it solves.
### Tradeoffs
What are the pros and cons of this feature? How will it be exercised and maintained?

.github/ISSUE_TEMPLATE/bug_report.md (new file, 39 lines)
View File

@ -0,0 +1,39 @@
---
name: Bug report
about: Report a bug to improve the project
title: ''
labels: ''
assignees: ''
---
<!-- READ: Issues are used to receive focused bug reports from users and to track planned future enhancements by the authors. Topics like cluster operation, support, debugging help, advice, and Kubernetes concepts are out of scope and should not use issues -->
**Description**
A clear and concise description of what the bug is.
**Steps to Reproduce**
Provide clear steps to reproduce the bug.
- [ ] Relevant error messages if appropriate (concise, not a dump of everything).
- [ ] Explored using a vanilla cluster from the [tutorials](https://typhoon.psdn.io/#documentation). Ruled out [customizations](https://typhoon.psdn.io/advanced/customization/).
**Expected behavior**
A clear and concise description of what you expected to happen.
**Environment**
* Platform: aws, azure, bare-metal, google-cloud, digital-ocean
* OS: fedora-coreos, flatcar-linux (include release version)
* Release: Typhoon version or Git SHA (reporting latest is **not** helpful)
* Terraform: `terraform version` (reporting latest is **not** helpful)
* Plugins: Provider plugin versions (reporting latest is **not** helpful)
**Possible Solution**
<!-- Most bug reports should have some inkling about solutions. Otherwise, your report may be less of a bug and more of a support request (see top).-->
Link to a PR or description.

.github/ISSUE_TEMPLATE/config.yml (new file, 5 lines)
View File

@ -0,0 +1,5 @@
blank_issues_enabled: true
contact_links:
- name: Security
url: https://typhoon.psdn.io/topics/security/
about: Report security vulnerabilities

.github/issue_template.md (new file, 15 lines)
View File

@ -0,0 +1,15 @@
<!-- READ: Issues are used to receive focused bug reports from users and to track planned future enhancements by the authors. Topics like cluster operation, support, debugging help, advice, and Kubernetes concepts are out of scope and should not use issues -->
## Enhancement
### Overview
One paragraph explanation of the enhancement.
### Motivation
Describe the motivation and what problem this solves.
### Tradeoffs
What are the pros and cons of this feature? How will it be exercised and maintained?

View File

@ -4,6 +4,161 @@ Notable changes between versions.
## Latest
## v1.18.6
* Kubernetes [v1.18.6](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1186)
* Update Calico from v3.15.0 to [v3.15.1](https://docs.projectcalico.org/v3.15/release-notes/)
* Update Cilium from v1.8.0 to [v1.8.1](https://github.com/cilium/cilium/releases/tag/v1.8.1)
#### Addons
* Update nginx-ingress from v0.33.0 to [v0.34.1](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.34.1)
* [ingress-nginx](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.34.0) will publish images only to gcr.io
* Update Prometheus from v2.19.1 to [v2.19.2](https://github.com/prometheus/prometheus/releases/tag/v2.19.2)
* Update Grafana from v7.0.4 to [v7.0.6](https://github.com/grafana/grafana/releases/tag/v7.0.6)
## v1.18.5
* Kubernetes [v1.18.5](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1185)
* Add Cilium v1.8.0 as an (experimental) CNI provider option ([#760](https://github.com/poseidon/typhoon/pull/760))
* Set `networking` to "cilium" to enable
* Update Calico from v3.14.1 to [v3.15.0](https://docs.projectcalico.org/v3.15/release-notes/)
#### DigitalOcean
* Isolate each cluster in an independent DigitalOcean VPC ([#776](https://github.com/poseidon/typhoon/pull/776))
* Create droplets in a VPC per cluster (matches Typhoon AWS, Azure, and GCP)
* Require `terraform-provider-digitalocean` v1.16.0+ (action required)
* Output `vpc_id` for use with an attached DigitalOcean [loadbalancer](https://github.com/poseidon/typhoon/blob/v1.18.5/docs/architecture/digitalocean.md#custom-load-balancer)
### Fedora CoreOS
#### Google Cloud
* Promote Fedora CoreOS to stable
* Remove `os_image` variable deprecated in v1.18.3 ([#777](https://github.com/poseidon/typhoon/pull/777))
* Use `os_stream` to select a Fedora CoreOS image stream
### Flatcar Linux
#### Azure
* Allow using Flatcar Linux Edge by setting `os_image` to "flatcar-edge" ([#778](https://github.com/poseidon/typhoon/pull/778))
#### Addons
* Update Prometheus from v2.19.0 to [v2.19.1](https://github.com/prometheus/prometheus/releases/tag/v2.19.1)
* Update Grafana from v7.0.3 to [v7.0.4](https://github.com/grafana/grafana/releases/tag/v7.0.4)
## v1.18.4
* Kubernetes [v1.18.4](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1184)
* Update Kubelet image publishing ([#749](https://github.com/poseidon/typhoon/pull/749))
* Build Kubelet images internally and publish to Quay and Dockerhub
* [quay.io/poseidon/kubelet](https://quay.io/repository/poseidon/kubelet) (official)
* [docker.io/psdn/kubelet](https://hub.docker.com/r/psdn/kubelet) (fallback)
* Continue offering automated image builds with an alternate tag strategy (see [docs](https://typhoon.psdn.io/topics/security/#container-images))
* [Document](https://typhoon.psdn.io/advanced/customization/#kubelet) use of alternate Kubelet images during registry incidents
* Update Calico from v3.14.0 to [v3.14.1](https://docs.projectcalico.org/v3.14/release-notes/)
* Fix [CVE-2020-13597](https://github.com/kubernetes/kubernetes/issues/91507)
* Rename controller NoSchedule taint from `node-role.kubernetes.io/master` to `node-role.kubernetes.io/controller` ([#764](https://github.com/poseidon/typhoon/pull/764))
* Tolerate the new taint name for workloads that may run on controller nodes
* Remove node label `node.kubernetes.io/master` from controller nodes ([#764](https://github.com/poseidon/typhoon/pull/764))
* Use `node.kubernetes.io/controller` (present since v1.9.5, [#160](https://github.com/poseidon/typhoon/pull/160)) to node select controllers
* Remove unused Kubelet `--lock-file` and `--exit-on-lock-contention` ([#758](https://github.com/poseidon/typhoon/pull/758))
### Fedora CoreOS
#### Azure
* Use `strict` Fedora CoreOS Config (FCC) snippet parsing ([#755](https://github.com/poseidon/typhoon/pull/755))
* Reduce Calico vxlan interface MTU to maintain performance ([#767](https://github.com/poseidon/typhoon/pull/767))
#### AWS
* Fix Kubelet service race with hostname update ([#766](https://github.com/poseidon/typhoon/pull/766))
* Wait for a hostname to avoid Kubelet trying to register as `localhost`
### Flatcar Linux
* Use `strict` Container Linux Config (CLC) snippet parsing ([#755](https://github.com/poseidon/typhoon/pull/755))
* Require `terraform-provider-ct` v0.4+, recommend v0.5+ (**action required**)
### Addons
* Update nginx-ingress from v0.32.0 to [v0.33.0](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-0.33.0)
* Update Prometheus from v2.18.1 to [v2.19.0](https://github.com/prometheus/prometheus/releases/tag/v2.19.0)
* Update node-exporter from v1.0.0-rc.1 to [v1.0.1](https://github.com/prometheus/node_exporter/releases/tag/v1.0.1)
* Update kube-state-metrics from v1.9.6 to v1.9.7
* Update Grafana from v7.0.0 to v7.0.3
## v1.18.3
* Kubernetes [v1.18.3](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1183)
* Use Kubelet [TLS bootstrap](https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/) with bootstrap token authentication ([#713](https://github.com/poseidon/typhoon/pull/713))
* Enable Node [Authorization](https://kubernetes.io/docs/reference/access-authn-authz/node/) and [NodeRestriction](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#noderestriction) to reduce authorization scope
* Renew Kubelet certificates every 72 hours
* Update etcd from v3.4.7 to [v3.4.9](https://github.com/etcd-io/etcd/releases/tag/v3.4.9)
* Update Calico from v3.13.1 to [v3.14.0](https://docs.projectcalico.org/v3.14/release-notes/)
* Add CoreDNS node affinity preference for controller nodes ([#188](https://github.com/poseidon/terraform-render-bootstrap/pull/188))
* Deprecate CoreOS Container Linux support (no OS [updates](https://coreos.com/os/eol/) after May 2020)
* Use a `fedora-coreos` module for Fedora CoreOS
* Use a `container-linux` module for Flatcar Linux
### AWS
* Fix Terraform plan error when `controller_count` exceeds AWS zones (e.g. 5 controllers) ([#714](https://github.com/poseidon/typhoon/pull/714))
* Regressed in v1.17.1 ([#605](https://github.com/poseidon/typhoon/pull/605))
### Azure
* Update Azure subnets to set `address_prefixes` list ([#730](https://github.com/poseidon/typhoon/pull/730))
* Fix warning that `address_prefix` is deprecated
* Require `terraform-provider-azurerm` v2.8.0+ (action required)
### DigitalOcean
* Promote DigitalOcean to beta on both Fedora CoreOS and Flatcar Linux
### Fedora CoreOS
* Fix Calico `install-cni` crashloop on Pod restarts ([#724](https://github.com/poseidon/typhoon/pull/724))
* SELinux enforcement requires consistent file context MCS level
* Restarting a node resolved the issue as a previous workaround
#### AWS
* Support Fedora CoreOS [image streams](https://docs.fedoraproject.org/en-US/fedora-coreos/update-streams/) ([#727](https://github.com/poseidon/typhoon/pull/727))
* Add `os_stream` variable to set the stream to `stable` (default), `testing`, or `next`
* Remove unused `os_image` variable
#### Google Cloud
* Support Fedora CoreOS [image streams](https://docs.fedoraproject.org/en-US/fedora-coreos/update-streams/) ([#723](https://github.com/poseidon/typhoon/pull/723))
* Add `os_stream` variable to set the stream to `stable` (default), `testing`, or `next`
* Deprecate `os_image` variable. Manual image uploads are no longer needed
### Flatcar Linux
#### Azure
* Use the Flatcar Linux Azure Marketplace image
* Restore [#664](https://github.com/poseidon/typhoon/pull/664) (reverted in [#707](https://github.com/poseidon/typhoon/pull/707)) but use Flatcar Linux's new free offer (not BYOL)
* Change `os_image` to use a `flatcar-stable` default
#### Google Cloud
* Promote Flatcar Linux to beta
### Addons
* Update nginx-ingress from v0.30.0 to [v0.32.0](https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.32.0)
* Add support for [IngressClass](https://kubernetes.io/docs/concepts/services-networking/ingress/#ingress-class)
* Update Prometheus from v2.17.1 to v2.18.1
* Update kube-state-metrics from v1.9.5 to [v1.9.6](https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.6)
* Update node-exporter from v1.0.0-rc.0 to [v1.0.0-rc.1](https://github.com/prometheus/node_exporter/releases/tag/v1.0.0-rc.1)
* Update Grafana from v6.7.2 to [v7.0.0](https://grafana.com/docs/grafana/latest/guides/whats-new-in-v7-0/)
## v1.18.2
* Kubernetes [v1.18.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1182)

View File

@ -11,9 +11,9 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.18.2 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Kubernetes v1.18.6 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [preemptible](https://typhoon.psdn.io/cl/google-cloud/#preemption) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
* Ready for Ingress, Prometheus, Grafana, CSI, or other [addons](https://typhoon.psdn.io/addons/overview/)
@ -28,35 +28,25 @@ Typhoon is available for [Fedora CoreOS](https://getfedora.org/coreos/).
| AWS | Fedora CoreOS | [aws/fedora-coreos/kubernetes](aws/fedora-coreos/kubernetes) | stable |
| Azure | Fedora CoreOS | [azure/fedora-coreos/kubernetes](azure/fedora-coreos/kubernetes) | alpha |
| Bare-Metal | Fedora CoreOS | [bare-metal/fedora-coreos/kubernetes](bare-metal/fedora-coreos/kubernetes) | beta |
| DigitalOcean | Fedora CoreOS | [digital-ocean/fedora-coreos/kubernetes](digital-ocean/fedora-coreos/kubernetes) | alpha |
| Google Cloud | Fedora CoreOS | [google-cloud/fedora-coreos/kubernetes](google-cloud/fedora-coreos/kubernetes) | beta |
| DigitalOcean | Fedora CoreOS | [digital-ocean/fedora-coreos/kubernetes](digital-ocean/fedora-coreos/kubernetes) | beta |
| Google Cloud | Fedora CoreOS | [google-cloud/fedora-coreos/kubernetes](google-cloud/fedora-coreos/kubernetes) | stable |
Typhoon is available for [Flatcar Container Linux](https://www.flatcar-linux.org/releases/).
Typhoon is available for [Flatcar Linux](https://www.flatcar-linux.org/releases/).
| Platform | Operating System | Terraform Module | Status |
|---------------|------------------|------------------|--------|
| AWS | Flatcar Linux | [aws/container-linux/kubernetes](aws/container-linux/kubernetes) | stable |
| Azure | Flatcar Linux | [azure/container-linux/kubernetes](azure/container-linux/kubernetes) | alpha |
| Bare-Metal | Flatcar Linux | [bare-metal/container-linux/kubernetes](bare-metal/container-linux/kubernetes) | stable |
| DigitalOcean | Flatcar Linux | [digital-ocean/container-linux/kubernetes](digital-ocean/container-linux/kubernetes) | alpha |
| Google Cloud | Flatcar Linux | [google-cloud/container-linux/kubernetes](google-cloud/container-linux/kubernetes) | alpha |
Typhoon is available for CoreOS Container Linux ([no updates](https://coreos.com/os/eol/) after May 2020).
| Platform | Operating System | Terraform Module | Status |
|---------------|------------------|------------------|--------|
| AWS | Container Linux | [aws/container-linux/kubernetes](aws/container-linux/kubernetes) | stable |
| Azure | Container Linux | [azure/container-linux/kubernetes](azure/container-linux/kubernetes) | alpha |
| Bare-Metal | Container Linux | [bare-metal/container-linux/kubernetes](bare-metal/container-linux/kubernetes) | stable |
| Digital Ocean | Container Linux | [digital-ocean/container-linux/kubernetes](digital-ocean/container-linux/kubernetes) | beta |
| Google Cloud | Container Linux | [google-cloud/container-linux/kubernetes](google-cloud/container-linux/kubernetes) | stable |
| DigitalOcean | Flatcar Linux | [digital-ocean/container-linux/kubernetes](digital-ocean/container-linux/kubernetes) | beta |
| Google Cloud | Flatcar Linux | [google-cloud/container-linux/kubernetes](google-cloud/container-linux/kubernetes) | beta |
## Documentation
* [Docs](https://typhoon.psdn.io)
* Architecture [concepts](https://typhoon.psdn.io/architecture/concepts/) and [operating systems](https://typhoon.psdn.io/architecture/operating-systems/)
* Fedora CoreOS tutorials for [AWS](docs/fedora-coreos/aws.md), [Azure](docs/fedora-coreos/azure.md), [Bare-Metal](docs/fedora-coreos/bare-metal.md), [DigitalOcean](docs/fedora-coreos/digitalocean.md), and [Google Cloud](docs/fedora-coreos/google-cloud.md)
* Flatcar Linux tutorials for [AWS](docs/cl/aws.md), [Azure](docs/cl/azure.md), [Bare-Metal](docs/cl/bare-metal.md), [DigitalOcean](docs/cl/digital-ocean.md), and [Google Cloud](docs/cl/google-cloud.md)
* Flatcar Linux tutorials for [AWS](docs/flatcar-linux/aws.md), [Azure](docs/flatcar-linux/azure.md), [Bare-Metal](docs/flatcar-linux/bare-metal.md), [DigitalOcean](docs/flatcar-linux/digitalocean.md), and [Google Cloud](docs/flatcar-linux/google-cloud.md)
## Usage
@ -64,7 +54,7 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
```tf
module "yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.18.2"
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.18.6"
# Google Cloud
cluster_name = "yavin"
@ -103,9 +93,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
$ export KUBECONFIG=/home/user/.kube/configs/yavin-config
$ kubectl get nodes
NAME ROLES STATUS AGE VERSION
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.18.2
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.18.2
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.18.2
yavin-controller-0.c.example-com.internal <none> Ready 6m v1.18.6
yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.18.6
yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.18.6
```
List the pods.

View File

@ -23,7 +23,7 @@ spec:
spec:
containers:
- name: grafana
image: docker.io/grafana/grafana:6.7.2
image: docker.io/grafana/grafana:7.0.6
env:
- name: GF_PATHS_CONFIG
value: "/etc/grafana/custom.ini"

View File

@ -0,0 +1,6 @@
apiVersion: networking.k8s.io/v1beta1
kind: IngressClass
metadata:
name: public
spec:
controller: k8s.io/ingress-nginx

View File

@ -22,7 +22,7 @@ spec:
spec:
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.30.0
image: us.gcr.io/k8s-artifacts-prod/ingress-nginx/controller:v0.34.1
args:
- /nginx-ingress-controller
- --ingress-class=public

View File

@ -51,3 +51,12 @@ rules:
- ingresses/status
verbs:
- update
- apiGroups:
- "networking.k8s.io"
resources:
- ingressclasses
verbs:
- get
- list
- watch

View File

@ -0,0 +1,6 @@
apiVersion: networking.k8s.io/v1beta1
kind: IngressClass
metadata:
name: public
spec:
controller: k8s.io/ingress-nginx

View File

@ -22,7 +22,7 @@ spec:
spec:
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.30.0
image: us.gcr.io/k8s-artifacts-prod/ingress-nginx/controller:v0.34.1
args:
- /nginx-ingress-controller
- --ingress-class=public

View File

@ -51,3 +51,12 @@ rules:
- ingresses/status
verbs:
- update
- apiGroups:
- "networking.k8s.io"
resources:
- ingressclasses
verbs:
- get
- list
- watch

View File

@ -0,0 +1,6 @@
apiVersion: networking.k8s.io/v1beta1
kind: IngressClass
metadata:
name: public
spec:
controller: k8s.io/ingress-nginx

View File

@ -22,7 +22,7 @@ spec:
spec:
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.30.0
image: us.gcr.io/k8s-artifacts-prod/ingress-nginx/controller:v0.34.1
args:
- /nginx-ingress-controller
- --ingress-class=public

View File

@ -51,3 +51,12 @@ rules:
- ingresses/status
verbs:
- update
- apiGroups:
- "networking.k8s.io"
resources:
- ingressclasses
verbs:
- get
- list
- watch

View File

@ -0,0 +1,6 @@
apiVersion: networking.k8s.io/v1beta1
kind: IngressClass
metadata:
name: public
spec:
controller: k8s.io/ingress-nginx

View File

@ -22,7 +22,7 @@ spec:
spec:
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.30.0
image: us.gcr.io/k8s-artifacts-prod/ingress-nginx/controller:v0.34.1
args:
- /nginx-ingress-controller
- --ingress-class=public

View File

@ -51,3 +51,12 @@ rules:
- ingresses/status
verbs:
- update
- apiGroups:
- "networking.k8s.io"
resources:
- ingressclasses
verbs:
- get
- list
- watch

View File

@ -0,0 +1,6 @@
apiVersion: networking.k8s.io/v1beta1
kind: IngressClass
metadata:
name: public
spec:
controller: k8s.io/ingress-nginx

View File

@ -22,7 +22,7 @@ spec:
spec:
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.30.0
image: us.gcr.io/k8s-artifacts-prod/ingress-nginx/controller:v0.34.1
args:
- /nginx-ingress-controller
- --ingress-class=public

View File

@ -51,3 +51,12 @@ rules:
- ingresses/status
verbs:
- update
- apiGroups:
- "networking.k8s.io"
resources:
- ingressclasses
verbs:
- get
- list
- watch

View File

@ -20,7 +20,7 @@ spec:
serviceAccountName: prometheus
containers:
- name: prometheus
image: quay.io/prometheus/prometheus:v2.17.1
image: quay.io/prometheus/prometheus:v2.19.2
args:
- --web.listen-address=0.0.0.0:9090
- --config.file=/etc/prometheus/prometheus.yaml

View File

@ -24,7 +24,7 @@ spec:
serviceAccountName: kube-state-metrics
containers:
- name: kube-state-metrics
image: quay.io/coreos/kube-state-metrics:v1.9.5
image: quay.io/coreos/kube-state-metrics:v1.9.7
ports:
- name: metrics
containerPort: 8080

View File

@ -28,7 +28,7 @@ spec:
hostPID: true
containers:
- name: node-exporter
image: quay.io/prometheus/node-exporter:v1.0.0-rc.0
image: quay.io/prometheus/node-exporter:v1.0.1
args:
- --path.procfs=/host/proc
- --path.sysfs=/host/sys

View File

@ -882,10 +882,10 @@ data:
{
"alert": "KubeClientCertificateExpiration",
"annotations": {
"message": "A client certificate used to authenticate to the apiserver is expiring in less than 7.0 days.",
"message": "A client certificate used to authenticate to the apiserver is expiring in less than 1.0 hours.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclientcertificateexpiration"
},
"expr": "apiserver_client_certificate_expiration_seconds_count{job=\"apiserver\"} > 0 and on(job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job=\"apiserver\"}[5m]))) < 604800\n",
"expr": "apiserver_client_certificate_expiration_seconds_count{job=\"apiserver\"} > 0 and on(job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job=\"apiserver\"}[5m]))) < 3600\n",
"labels": {
"severity": "warning"
}
@ -893,10 +893,10 @@ data:
{
"alert": "KubeClientCertificateExpiration",
"annotations": {
"message": "A client certificate used to authenticate to the apiserver is expiring in less than 24.0 hours.",
"message": "A client certificate used to authenticate to the apiserver is expiring in less than 0.1 hours.",
"runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclientcertificateexpiration"
},
"expr": "apiserver_client_certificate_expiration_seconds_count{job=\"apiserver\"} > 0 and on(job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job=\"apiserver\"}[5m]))) < 86400\n",
"expr": "apiserver_client_certificate_expiration_seconds_count{job=\"apiserver\"} > 0 and on(job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job=\"apiserver\"}[5m]))) < 300\n",
"labels": {
"severity": "critical"
}

View File

@ -11,11 +11,11 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.18.2 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* Kubernetes v1.18.6 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/cl/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
* Ready for Ingress, Prometheus, Grafana, CSI, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
## Docs

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=14d0b2087962a0f2557c184f3f523548ce19bbdc"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=835890025bfb3499aa0cd5a9704e9e7d3e7e5fe5"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]

View File

@ -2,12 +2,12 @@
systemd:
units:
- name: etcd-member.service
enable: true
enabled: true
dropins:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.4.7"
Environment="ETCD_IMAGE_TAG=v3.4.9"
Environment="ETCD_IMAGE_URL=docker://quay.io/coreos/etcd"
Environment="RKT_RUN_ARGS=--insecure-options=image"
Environment="ETCD_NAME=${etcd_name}"
@ -28,11 +28,11 @@ systemd:
Environment="ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer.key"
Environment="ETCD_PEER_CLIENT_CERT_AUTH=true"
- name: docker.service
enable: true
enabled: true
- name: locksmithd.service
mask: true
- name: wait-for-dns.service
enable: true
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
@ -46,12 +46,13 @@ systemd:
RequiredBy=kubelet.service
RequiredBy=etcd-member.service
- name: kubelet.service
enable: true
enabled: true
contents: |
[Unit]
Description=Kubelet via Hyperkube
Description=Kubelet
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.6
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
@ -91,25 +92,24 @@ systemd:
--mount volume=var-log,target=/var/log \
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=opt-cni-bin,target=/opt/cni/bin \
docker://quay.io/poseidon/kubelet:v1.18.2 -- \
$${KUBELET_IMAGE} -- \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--healthz-port=0 \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/master \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
@ -134,7 +134,7 @@ systemd:
--volume script,kind=host,source=/opt/bootstrap/apply \
--mount volume=script,target=/apply \
--insecure-options=image \
docker://quay.io/poseidon/kubelet:v1.18.2 \
docker://quay.io/poseidon/kubelet:v1.18.6 \
--net=host \
--dns=host \
--exec=/apply
@ -165,11 +165,11 @@ storage:
chmod -R 500 /etc/ssl/etcd
mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
sudo mkdir -p /etc/kubernetes/manifests
sudo mv static-manifests/* /etc/kubernetes/manifests/
sudo mkdir -p /opt/bootstrap/assets
sudo mv manifests /opt/bootstrap/assets/manifests
sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/
mkdir -p /etc/kubernetes/manifests
mv static-manifests/* /etc/kubernetes/manifests/
mkdir -p /opt/bootstrap/assets
mv manifests /opt/bootstrap/assets/manifests
mv manifests-networking/* /opt/bootstrap/assets/manifests/
rm -rf assets auth static-manifests tls manifests-networking
- path: /opt/bootstrap/apply
filesystem: root
@ -188,6 +188,7 @@ storage:
done
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
mode: 0644
contents:
inline: |
fs.inotify.max_user_watches=16184

View File

@ -36,7 +36,7 @@ resource "aws_instance" "controllers" {
# network
associate_public_ip_address = true
subnet_id = aws_subnet.public.*.id[count.index]
subnet_id = element(aws_subnet.public.*.id, count.index)
vpc_security_group_ids = [aws_security_group.controller.id]
lifecycle {
@ -49,10 +49,10 @@ resource "aws_instance" "controllers" {
# Controller Ignition configs
data "ct_config" "controller-ignitions" {
count = var.controller_count
content = data.template_file.controller-configs.*.rendered[count.index]
pretty_print = false
snippets = var.controller_snippets
count = var.controller_count
content = data.template_file.controller-configs.*.rendered[count.index]
strict = true
snippets = var.controller_snippets
}
# Controller Container Linux configs

View File

@ -13,6 +13,30 @@ resource "aws_security_group" "controller" {
}
}
resource "aws_security_group_rule" "controller-icmp" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "icmp"
from_port = 8
to_port = 0
source_security_group_id = aws_security_group.worker.id
}
resource "aws_security_group_rule" "controller-icmp-self" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "icmp"
from_port = 8
to_port = 0
self = true
}
resource "aws_security_group_rule" "controller-ssh" {
security_group_id = aws_security_group.controller.id
@ -44,39 +68,31 @@ resource "aws_security_group_rule" "controller-etcd-metrics" {
source_security_group_id = aws_security_group.worker.id
}
# Allow Prometheus to scrape kube-proxy
resource "aws_security_group_rule" "kube-proxy-metrics" {
resource "aws_security_group_rule" "controller-cilium-health" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "tcp"
from_port = 10249
to_port = 10249
from_port = 4240
to_port = 4240
source_security_group_id = aws_security_group.worker.id
}
# Allow Prometheus to scrape kube-scheduler
resource "aws_security_group_rule" "controller-scheduler-metrics" {
resource "aws_security_group_rule" "controller-cilium-health-self" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "tcp"
from_port = 10251
to_port = 10251
source_security_group_id = aws_security_group.worker.id
}
# Allow Prometheus to scrape kube-controller-manager
resource "aws_security_group_rule" "controller-manager-metrics" {
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "tcp"
from_port = 10252
to_port = 10252
source_security_group_id = aws_security_group.worker.id
type = "ingress"
protocol = "tcp"
from_port = 4240
to_port = 4240
self = true
}
# IANA VXLAN default
resource "aws_security_group_rule" "controller-vxlan" {
count = var.networking == "flannel" ? 1 : 0
@ -111,6 +127,31 @@ resource "aws_security_group_rule" "controller-apiserver" {
cidr_blocks = ["0.0.0.0/0"]
}
# Linux VXLAN default
resource "aws_security_group_rule" "controller-linux-vxlan" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "udp"
from_port = 8472
to_port = 8472
source_security_group_id = aws_security_group.worker.id
}
resource "aws_security_group_rule" "controller-linux-vxlan-self" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "udp"
from_port = 8472
to_port = 8472
self = true
}
# Allow Prometheus to scrape node-exporter daemonset
resource "aws_security_group_rule" "controller-node-exporter" {
security_group_id = aws_security_group.controller.id
@ -122,6 +163,17 @@ resource "aws_security_group_rule" "controller-node-exporter" {
source_security_group_id = aws_security_group.worker.id
}
# Allow Prometheus to scrape kube-proxy
resource "aws_security_group_rule" "kube-proxy-metrics" {
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "tcp"
from_port = 10249
to_port = 10249
source_security_group_id = aws_security_group.worker.id
}
# Allow apiserver to access kubelets for exec, log, port-forward
resource "aws_security_group_rule" "controller-kubelet" {
security_group_id = aws_security_group.controller.id
@ -143,6 +195,28 @@ resource "aws_security_group_rule" "controller-kubelet-self" {
self = true
}
# Allow Prometheus to scrape kube-scheduler
resource "aws_security_group_rule" "controller-scheduler-metrics" {
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "tcp"
from_port = 10251
to_port = 10251
source_security_group_id = aws_security_group.worker.id
}
# Allow Prometheus to scrape kube-controller-manager
resource "aws_security_group_rule" "controller-manager-metrics" {
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "tcp"
from_port = 10252
to_port = 10252
source_security_group_id = aws_security_group.worker.id
}
resource "aws_security_group_rule" "controller-bgp" {
security_group_id = aws_security_group.controller.id
@ -227,6 +301,30 @@ resource "aws_security_group" "worker" {
}
}
resource "aws_security_group_rule" "worker-icmp" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.worker.id
type = "ingress"
protocol = "icmp"
from_port = 8
to_port = 0
source_security_group_id = aws_security_group.controller.id
}
resource "aws_security_group_rule" "worker-icmp-self" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.worker.id
type = "ingress"
protocol = "icmp"
from_port = 8
to_port = 0
self = true
}
resource "aws_security_group_rule" "worker-ssh" {
security_group_id = aws_security_group.worker.id
@ -257,6 +355,31 @@ resource "aws_security_group_rule" "worker-https" {
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "worker-cilium-health" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.worker.id
type = "ingress"
protocol = "tcp"
from_port = 4240
to_port = 4240
source_security_group_id = aws_security_group.controller.id
}
resource "aws_security_group_rule" "worker-cilium-health-self" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.worker.id
type = "ingress"
protocol = "tcp"
from_port = 4240
to_port = 4240
self = true
}
# IANA VXLAN default
resource "aws_security_group_rule" "worker-vxlan" {
count = var.networking == "flannel" ? 1 : 0
@ -281,6 +404,31 @@ resource "aws_security_group_rule" "worker-vxlan-self" {
self = true
}
# Linux VXLAN default
resource "aws_security_group_rule" "worker-linux-vxlan" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.worker.id
type = "ingress"
protocol = "udp"
from_port = 8472
to_port = 8472
source_security_group_id = aws_security_group.controller.id
}
resource "aws_security_group_rule" "worker-linux-vxlan-self" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.worker.id
type = "ingress"
protocol = "udp"
from_port = 8472
to_port = 8472
self = true
}
# Allow Prometheus to scrape node-exporter daemonset
resource "aws_security_group_rule" "worker-node-exporter" {
security_group_id = aws_security_group.worker.id
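A note on the rules above: for protocol "icmp", AWS repurposes from_port as the ICMP type and to_port as the ICMP code, so 8/0 admits only echo requests (pings), which is what cilium-health sends, rather than all ICMP. TCP 4240 is cilium-health's probe port, and UDP 8472 is the Linux kernel's default VXLAN port used by Cilium, distinct from the IANA-assigned 4789 behind the "IANA VXLAN default" flannel rules. The ICMP encoding restated as a sketch:

# Sketch: AWS encodes ICMP type/code in the port fields for icmp rules.
resource "aws_security_group_rule" "icmp_echo_example" {
  security_group_id = aws_security_group.worker.id
  type              = "ingress"
  protocol          = "icmp"
  from_port         = 8 # ICMP type: echo request
  to_port           = 0 # ICMP code: 0
  self              = true
}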

View File

@ -4,7 +4,7 @@ terraform {
required_version = "~> 0.12.6"
required_providers {
aws = "~> 2.23"
ct = "~> 0.3"
ct = "~> 0.4"
template = "~> 2.1"
null = "~> 2.1"
}
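The ~> operator here is Terraform's pessimistic constraint: it holds every version component fixed except the rightmost one given, so this bump opts in to new ct minor releases without risking a breaking major upgrade. As a sketch:

terraform {
  required_version = "~> 0.12.6" # >= 0.12.6, < 0.13.0
  required_providers {
    ct  = "~> 0.4"  # >= 0.4.0, < 1.0.0 (accepts 0.5, 0.6, ...)
    aws = "~> 2.23" # >= 2.23.0, < 3.0.0
  }
}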

View File

@ -2,11 +2,11 @@
systemd:
units:
- name: docker.service
enable: true
enabled: true
- name: locksmithd.service
mask: true
- name: wait-for-dns.service
enable: true
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
@ -19,12 +19,13 @@ systemd:
[Install]
RequiredBy=kubelet.service
- name: kubelet.service
enable: true
enabled: true
contents: |
[Unit]
Description=Kubelet via Hyperkube
Description=Kubelet
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.6
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
@ -64,19 +65,18 @@ systemd:
--mount volume=var-log,target=/var/log \
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=opt-cni-bin,target=/opt/cni/bin \
docker://quay.io/poseidon/kubelet:v1.18.2 -- \
$${KUBELET_IMAGE} -- \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--healthz-port=0 \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/node \
%{~ for label in split(",", node_labels) ~}
@ -84,6 +84,7 @@ systemd:
%{~ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
@ -112,6 +113,7 @@ storage:
${kubeconfig}
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
mode: 0644
contents:
inline: |
fs.inotify.max_user_watches=16184
@ -127,7 +129,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://quay.io/poseidon/kubelet:v1.18.2 \
docker://quay.io/poseidon/kubelet:v1.18.6 \
--net=host \
--dns=host \
--exec=/usr/local/bin/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)
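The Kubelet changes above do two things: pinning the image once in Environment=KUBELET_IMAGE means future version bumps touch a single line, and the --bootstrap-kubeconfig plus --rotate-certificates pair moves the Kubelet to TLS bootstrapping, where it uses the shared bootstrap kubeconfig once, then obtains and rotates its own client certificate, storing the result at /var/lib/kubelet/kubeconfig. A hypothetical snippet overriding the pinned image via a systemd drop-in (drop-in name and tag are illustrative, placed inside a cluster or worker-pool module block):

snippets = [
  <<-EOF
  systemd:
    units:
      - name: kubelet.service
        dropins:
          - name: 10-image.conf
            contents: |
              [Service]
              Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.6
  EOF
]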

View File

@ -71,9 +71,9 @@ resource "aws_launch_configuration" "worker" {
# Worker Ignition config
data "ct_config" "worker-ignition" {
content = data.template_file.worker-config.rendered
pretty_print = false
snippets = var.snippets
content = data.template_file.worker-config.rendered
strict = true
snippets = var.snippets
}
# Worker Container Linux config

View File

@ -11,11 +11,11 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.18.2 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Kubernetes v1.18.6 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/cl/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
* Ready for Ingress, Prometheus, Grafana, CSI, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
## Docs

View File

@ -13,16 +13,8 @@ data "aws_ami" "fedora-coreos" {
values = ["hvm"]
}
filter {
name = "name"
values = ["fedora-coreos-31.*.*.*-hvm"]
}
filter {
name = "description"
values = ["Fedora CoreOS stable*"]
values = ["Fedora CoreOS ${var.os_stream} *"]
}
# try to filter out dev images (AWS filters can't)
name_regex = "^fedora-coreos-31.[0-9]*.[0-9]*.[0-9]*-hvm*"
}
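Filtering on the image description with ${var.os_stream} lets one data source serve all three official streams, replacing the hard-coded Fedora CoreOS 31 name match and the dev-image regex. Selecting a stream then becomes a one-line module argument; a sketch with illustrative values:

module "tempest" {
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.18.6"

  os_stream = "testing" # or "stable" (default), "next"

  # minimal required arguments (values illustrative)
  cluster_name       = "tempest"
  dns_zone           = "aws.example.com"
  dns_zone_id        = "Z3PAABBCC"
  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
}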

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=14d0b2087962a0f2557c184f3f523548ce19bbdc"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=835890025bfb3499aa0cd5a9704e9e7d3e7e5fe5"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]

View File

@ -36,7 +36,7 @@ resource "aws_instance" "controllers" {
# network
associate_public_ip_address = true
subnet_id = aws_subnet.public.*.id[count.index]
subnet_id = element(aws_subnet.public.*.id, count.index)
vpc_security_group_ids = [aws_security_group.controller.id]
lifecycle {

View File

@ -28,7 +28,7 @@ systemd:
--network host \
--volume /var/lib/etcd:/var/lib/etcd:rw,Z \
--volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \
quay.io/coreos/etcd:v3.4.7
quay.io/coreos/etcd:v3.4.9
ExecStop=/usr/bin/podman stop etcd
[Install]
WantedBy=multi-user.target
@ -38,11 +38,12 @@ systemd:
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
Description=Wait for DNS and hostname
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStartPre=/bin/sh -c 'while [ `hostname -s` == "localhost" ]; do sleep 1; done;'
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
@ -51,9 +52,10 @@ systemd:
enabled: true
contents: |
[Unit]
Description=Kubelet via Hyperkube (System Container)
Description=Kubelet (System Container)
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.6
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
@ -79,10 +81,11 @@ systemd:
--volume /var/log:/var/log \
--volume /var/run/lock:/var/run/lock:z \
--volume /opt/cni/bin:/opt/cni/bin:z \
quay.io/poseidon/kubelet:v1.18.2 \
$${KUBELET_IMAGE} \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=systemd \
--cgroups-per-qos=true \
--enforce-node-allocatable=pods \
@ -90,16 +93,14 @@ systemd:
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--healthz-port=0 \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/master \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/podman stop kubelet
Delegate=yes
@ -123,7 +124,7 @@ systemd:
--volume /opt/bootstrap/assets:/assets:ro,Z \
--volume /opt/bootstrap/apply:/apply:ro,Z \
--entrypoint=/apply \
quay.io/poseidon/kubelet:v1.18.2
quay.io/poseidon/kubelet:v1.18.6
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
ExecStartPost=-/usr/bin/podman stop bootstrap
storage:
@ -151,11 +152,11 @@ storage:
chmod -R 500 /etc/ssl/etcd
mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
sudo mkdir -p /etc/kubernetes/manifests
sudo mv static-manifests/* /etc/kubernetes/manifests/
sudo mkdir -p /opt/bootstrap/assets
sudo mv manifests /opt/bootstrap/assets/manifests
sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/
mkdir -p /etc/kubernetes/manifests
mv static-manifests/* /etc/kubernetes/manifests/
mkdir -p /opt/bootstrap/assets
mv manifests /opt/bootstrap/assets/manifests
mv manifests-networking/* /opt/bootstrap/assets/manifests/
rm -rf assets auth static-manifests tls manifests-networking
- path: /opt/bootstrap/apply
mode: 0544
@ -175,6 +176,11 @@ storage:
contents:
inline: |
fs.inotify.max_user_watches=16184
- path: /etc/sysctl.d/reverse-path-filter.conf
contents:
inline: |
net.ipv4.conf.default.rp_filter=0
net.ipv4.conf.*.rp_filter=0
- path: /etc/systemd/system.conf.d/accounting.conf
contents:
inline: |

View File

@ -13,6 +13,30 @@ resource "aws_security_group" "controller" {
}
}
resource "aws_security_group_rule" "controller-icmp" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "icmp"
from_port = 8
to_port = 0
source_security_group_id = aws_security_group.worker.id
}
resource "aws_security_group_rule" "controller-icmp-self" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "icmp"
from_port = 8
to_port = 0
self = true
}
resource "aws_security_group_rule" "controller-ssh" {
security_group_id = aws_security_group.controller.id
@ -44,39 +68,31 @@ resource "aws_security_group_rule" "controller-etcd-metrics" {
source_security_group_id = aws_security_group.worker.id
}
# Allow Prometheus to scrape kube-proxy
resource "aws_security_group_rule" "kube-proxy-metrics" {
resource "aws_security_group_rule" "controller-cilium-health" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "tcp"
from_port = 10249
to_port = 10249
from_port = 4240
to_port = 4240
source_security_group_id = aws_security_group.worker.id
}
# Allow Prometheus to scrape kube-scheduler
resource "aws_security_group_rule" "controller-scheduler-metrics" {
resource "aws_security_group_rule" "controller-cilium-health-self" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "tcp"
from_port = 10251
to_port = 10251
source_security_group_id = aws_security_group.worker.id
}
# Allow Prometheus to scrape kube-controller-manager
resource "aws_security_group_rule" "controller-manager-metrics" {
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "tcp"
from_port = 10252
to_port = 10252
source_security_group_id = aws_security_group.worker.id
type = "ingress"
protocol = "tcp"
from_port = 4240
to_port = 4240
self = true
}
# IANA VXLAN default
resource "aws_security_group_rule" "controller-vxlan" {
count = var.networking == "flannel" ? 1 : 0
@ -111,6 +127,31 @@ resource "aws_security_group_rule" "controller-apiserver" {
cidr_blocks = ["0.0.0.0/0"]
}
# Linux VXLAN default
resource "aws_security_group_rule" "controller-linux-vxlan" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "udp"
from_port = 8472
to_port = 8472
source_security_group_id = aws_security_group.worker.id
}
resource "aws_security_group_rule" "controller-linux-vxlan-self" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "udp"
from_port = 8472
to_port = 8472
self = true
}
# Allow Prometheus to scrape node-exporter daemonset
resource "aws_security_group_rule" "controller-node-exporter" {
security_group_id = aws_security_group.controller.id
@ -122,6 +163,17 @@ resource "aws_security_group_rule" "controller-node-exporter" {
source_security_group_id = aws_security_group.worker.id
}
# Allow Prometheus to scrape kube-proxy
resource "aws_security_group_rule" "kube-proxy-metrics" {
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "tcp"
from_port = 10249
to_port = 10249
source_security_group_id = aws_security_group.worker.id
}
# Allow apiserver to access kubelets for exec, log, port-forward
resource "aws_security_group_rule" "controller-kubelet" {
security_group_id = aws_security_group.controller.id
@ -143,6 +195,28 @@ resource "aws_security_group_rule" "controller-kubelet-self" {
self = true
}
# Allow Prometheus to scrape kube-scheduler
resource "aws_security_group_rule" "controller-scheduler-metrics" {
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "tcp"
from_port = 10251
to_port = 10251
source_security_group_id = aws_security_group.worker.id
}
# Allow Prometheus to scrape kube-controller-manager
resource "aws_security_group_rule" "controller-manager-metrics" {
security_group_id = aws_security_group.controller.id
type = "ingress"
protocol = "tcp"
from_port = 10252
to_port = 10252
source_security_group_id = aws_security_group.worker.id
}
resource "aws_security_group_rule" "controller-bgp" {
security_group_id = aws_security_group.controller.id
@ -227,6 +301,30 @@ resource "aws_security_group" "worker" {
}
}
resource "aws_security_group_rule" "worker-icmp" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.worker.id
type = "ingress"
protocol = "icmp"
from_port = 8
to_port = 0
source_security_group_id = aws_security_group.controller.id
}
resource "aws_security_group_rule" "worker-icmp-self" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.worker.id
type = "ingress"
protocol = "icmp"
from_port = 8
to_port = 0
self = true
}
resource "aws_security_group_rule" "worker-ssh" {
security_group_id = aws_security_group.worker.id
@ -257,6 +355,31 @@ resource "aws_security_group_rule" "worker-https" {
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "worker-cilium-health" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.worker.id
type = "ingress"
protocol = "tcp"
from_port = 4240
to_port = 4240
source_security_group_id = aws_security_group.controller.id
}
resource "aws_security_group_rule" "worker-cilium-health-self" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.worker.id
type = "ingress"
protocol = "tcp"
from_port = 4240
to_port = 4240
self = true
}
# IANA VXLAN default
resource "aws_security_group_rule" "worker-vxlan" {
count = var.networking == "flannel" ? 1 : 0
@ -281,6 +404,31 @@ resource "aws_security_group_rule" "worker-vxlan-self" {
self = true
}
# Linux VXLAN default
resource "aws_security_group_rule" "worker-linux-vxlan" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.worker.id
type = "ingress"
protocol = "udp"
from_port = 8472
to_port = 8472
source_security_group_id = aws_security_group.controller.id
}
resource "aws_security_group_rule" "worker-linux-vxlan-self" {
count = var.networking == "cilium" ? 1 : 0
security_group_id = aws_security_group.worker.id
type = "ingress"
protocol = "udp"
from_port = 8472
to_port = 8472
self = true
}
# Allow Prometheus to scrape node-exporter daemonset
resource "aws_security_group_rule" "worker-node-exporter" {
security_group_id = aws_security_group.worker.id
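The count = var.networking == "cilium" ? 1 : 0 lines above are the standard Terraform 0.12 idiom for conditional resources: the rule is materialized only when Cilium is the selected CNI, and would be referenced with an index (e.g. aws_security_group_rule.worker-icmp[0]) where needed. Sketch:

resource "aws_security_group_rule" "cilium_only_example" {
  count = var.networking == "cilium" ? 1 : 0 # 0 instances = rule absent

  security_group_id = aws_security_group.worker.id
  type              = "ingress"
  protocol          = "tcp"
  from_port         = 4240 # cilium-health
  to_port           = 4240
  self              = true
}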

View File

@ -41,9 +41,9 @@ variable "worker_type" {
default = "t3.small"
}
variable "os_image" {
variable "os_stream" {
type = string
description = "AMI channel for Fedora CoreOS (not yet used)"
description = "Fedora CoreOs image stream for instances (e.g. stable, testing, next)"
default = "stable"
}

View File

@ -8,7 +8,7 @@ module "workers" {
security_groups = [aws_security_group.worker.id]
worker_count = var.worker_count
instance_type = var.worker_type
os_image = var.os_image
os_stream = var.os_stream
disk_size = var.disk_size
spot_price = var.worker_price
target_groups = var.worker_target_groups

View File

@ -13,16 +13,8 @@ data "aws_ami" "fedora-coreos" {
values = ["hvm"]
}
filter {
name = "name"
values = ["fedora-coreos-31.*.*.*-hvm"]
}
filter {
name = "description"
values = ["Fedora CoreOS stable*"]
values = ["Fedora CoreOS ${var.os_stream} *"]
}
# try to filter out dev images (AWS filters can't)
name_regex = "^fedora-coreos-31.[0-9]*.[0-9]*.[0-9]*-hvm*"
}

View File

@ -9,11 +9,12 @@ systemd:
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
Description=Wait for DNS and hostname
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStartPre=/bin/sh -c 'while [ `hostname -s` == "localhost" ]; do sleep 1; done;'
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
@ -21,9 +22,10 @@ systemd:
enabled: true
contents: |
[Unit]
Description=Kubelet via Hyperkube (System Container)
Description=Kubelet (System Container)
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.6
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
@ -49,10 +51,11 @@ systemd:
--volume /var/log:/var/log \
--volume /var/run/lock:/var/run/lock:z \
--volume /opt/cni/bin:/opt/cni/bin:z \
quay.io/poseidon/kubelet:v1.18.2 \
$${KUBELET_IMAGE} \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=systemd \
--cgroups-per-qos=true \
--enforce-node-allocatable=pods \
@ -60,10 +63,8 @@ systemd:
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--healthz-port=0 \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/node \
%{~ for label in split(",", node_labels) ~}
@ -71,6 +72,7 @@ systemd:
%{~ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/podman stop kubelet
Delegate=yes
@ -87,7 +89,7 @@ systemd:
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.18.2 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME'
ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.18.6 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME'
[Install]
WantedBy=multi-user.target
storage:
@ -103,6 +105,11 @@ storage:
contents:
inline: |
fs.inotify.max_user_watches=16184
- path: /etc/sysctl.d/reverse-path-filter.conf
contents:
inline: |
net.ipv4.conf.default.rp_filter=0
net.ipv4.conf.*.rp_filter=0
- path: /etc/systemd/system.conf.d/accounting.conf
contents:
inline: |

View File

@ -34,9 +34,9 @@ variable "instance_type" {
default = "t3.small"
}
variable "os_image" {
variable "os_stream" {
type = string
description = "AMI channel for Fedora CoreOS (not yet used)"
description = "Fedora CoreOs image stream for instances (e.g. stable, testing, next)"
default = "stable"
}

View File

@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.18.2 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* Kubernetes v1.18.6 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [low-priority](https://typhoon.psdn.io/cl/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=14d0b2087962a0f2557c184f3f523548ce19bbdc"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=835890025bfb3499aa0cd5a9704e9e7d3e7e5fe5"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]

View File

@ -2,12 +2,12 @@
systemd:
units:
- name: etcd-member.service
enable: true
enabled: true
dropins:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.4.7"
Environment="ETCD_IMAGE_TAG=v3.4.9"
Environment="ETCD_IMAGE_URL=docker://quay.io/coreos/etcd"
Environment="RKT_RUN_ARGS=--insecure-options=image"
Environment="ETCD_NAME=${etcd_name}"
@ -28,11 +28,11 @@ systemd:
Environment="ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer.key"
Environment="ETCD_PEER_CLIENT_CERT_AUTH=true"
- name: docker.service
enable: true
enabled: true
- name: locksmithd.service
mask: true
- name: wait-for-dns.service
enable: true
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
@ -46,12 +46,14 @@ systemd:
RequiredBy=kubelet.service
RequiredBy=etcd-member.service
- name: kubelet.service
enable: true
enabled: true
contents: |
[Unit]
Description=Kubelet via Hyperkube
Description=Kubelet
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.6
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
@ -90,24 +92,24 @@ systemd:
--mount volume=var-log,target=/var/log \
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=opt-cni-bin,target=/opt/cni/bin \
docker://quay.io/poseidon/kubelet:v1.18.2 -- \
$${KUBELET_IMAGE} -- \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--healthz-port=0 \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/master \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
@ -132,7 +134,7 @@ systemd:
--volume script,kind=host,source=/opt/bootstrap/apply \
--mount volume=script,target=/apply \
--insecure-options=image \
docker://quay.io/poseidon/kubelet:v1.18.2 \
docker://quay.io/poseidon/kubelet:v1.18.6 \
--net=host \
--dns=host \
--exec=/apply
@ -163,11 +165,11 @@ storage:
chmod -R 500 /etc/ssl/etcd
mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
sudo mkdir -p /etc/kubernetes/manifests
sudo mv static-manifests/* /etc/kubernetes/manifests/
sudo mkdir -p /opt/bootstrap/assets
sudo mv manifests /opt/bootstrap/assets/manifests
sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/
mkdir -p /etc/kubernetes/manifests
mv static-manifests/* /etc/kubernetes/manifests/
mkdir -p /opt/bootstrap/assets
mv manifests /opt/bootstrap/assets/manifests
mv manifests-networking/* /opt/bootstrap/assets/manifests/
rm -rf assets auth static-manifests tls manifests-networking
- path: /opt/bootstrap/apply
filesystem: root
@ -186,6 +188,7 @@ storage:
done
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
mode: 0644
contents:
inline: |
fs.inotify.max_user_watches=16184

View File

@ -53,18 +53,24 @@ resource "azurerm_linux_virtual_machine" "controllers" {
storage_account_type = "Premium_LRS"
}
// CoreOS Container Linux or Flatcar Container Linux (manual upload)
dynamic "source_image_reference" {
for_each = local.flavor == "coreos" ? [1] : []
# CoreOS Container Linux or Flatcar Container Linux
source_image_reference {
publisher = local.flavor == "flatcar" ? "Kinvolk" : "CoreOS"
offer = local.flavor == "flatcar" ? "flatcar-container-linux-free" : "CoreOS"
sku = local.channel
version = "latest"
}
# Gross hack for Flatcar Linux
dynamic "plan" {
for_each = local.flavor == "flatcar" ? [1] : []
content {
publisher = "CoreOS"
offer = "CoreOS"
sku = local.channel
version = "latest"
name = local.channel
publisher = "kinvolk"
product = "flatcar-container-linux-free"
}
}
source_image_id = local.flavor == "coreos" ? null : var.os_image
# network
network_interface_ids = [
@ -133,10 +139,10 @@ resource "azurerm_network_interface_backend_address_pool_association" "controlle
# Controller Ignition configs
data "ct_config" "controller-ignitions" {
count = var.controller_count
content = data.template_file.controller-configs.*.rendered[count.index]
pretty_print = false
snippets = var.controller_snippets
count = var.controller_count
content = data.template_file.controller-configs.*.rendered[count.index]
strict = true
snippets = var.controller_snippets
}
# Controller Container Linux configs
@ -151,6 +157,7 @@ data "template_file" "controller-configs" {
etcd_domain = "${var.cluster_name}-etcd${count.index}.${var.dns_zone}"
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
etcd_initial_cluster = join(",", data.template_file.etcds.*.rendered)
cgroup_driver = local.flavor == "flatcar" && local.channel == "edge" ? "systemd" : "cgroupfs"
kubeconfig = indent(10, module.bootstrap.kubeconfig-kubelet)
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
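The dynamic "plan" block here attaches Azure Marketplace purchase-plan metadata, which Flatcar's Marketplace image requires and CoreOS's does not (terms are typically accepted once per subscription, e.g. with az vm image terms accept). A for_each over a zero- or one-element list is the standard idiom for an optional nested block:

# Sketch: [] renders no block; [1] renders exactly one.
dynamic "plan" {
  for_each = local.flavor == "flatcar" ? [1] : []
  content {
    name      = local.channel
    publisher = "kinvolk"
    product   = "flatcar-container-linux-free"
  }
}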

View File

@ -21,7 +21,7 @@ resource "azurerm_subnet" "controller" {
name = "controller"
virtual_network_name = azurerm_virtual_network.network.name
address_prefix = cidrsubnet(var.host_cidr, 1, 0)
address_prefixes = [cidrsubnet(var.host_cidr, 1, 0)]
}
resource "azurerm_subnet_network_security_group_association" "controller" {
@ -34,7 +34,7 @@ resource "azurerm_subnet" "worker" {
name = "worker"
virtual_network_name = azurerm_virtual_network.network.name
address_prefix = cidrsubnet(var.host_cidr, 1, 1)
address_prefixes = [cidrsubnet(var.host_cidr, 1, 1)]
}
resource "azurerm_subnet_network_security_group_association" "worker" {

View File

@ -7,6 +7,21 @@ resource "azurerm_network_security_group" "controller" {
location = azurerm_resource_group.cluster.location
}
resource "azurerm_network_security_rule" "controller-icmp" {
resource_group_name = azurerm_resource_group.cluster.name
name = "allow-icmp"
network_security_group_name = azurerm_network_security_group.controller.name
priority = "1995"
access = "Allow"
direction = "Inbound"
protocol = "Icmp"
source_port_range = "*"
destination_port_range = "*"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.controller.address_prefix
}
resource "azurerm_network_security_rule" "controller-ssh" {
resource_group_name = azurerm_resource_group.cluster.name
@ -100,6 +115,22 @@ resource "azurerm_network_security_rule" "controller-apiserver" {
destination_address_prefix = azurerm_subnet.controller.address_prefix
}
resource "azurerm_network_security_rule" "controller-cilium-health" {
resource_group_name = azurerm_resource_group.cluster.name
count = var.networking == "cilium" ? 1 : 0
name = "allow-cilium-health"
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2019"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "4240"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.controller.address_prefix
}
resource "azurerm_network_security_rule" "controller-vxlan" {
resource_group_name = azurerm_resource_group.cluster.name
@ -115,6 +146,21 @@ resource "azurerm_network_security_rule" "controller-vxlan" {
destination_address_prefix = azurerm_subnet.controller.address_prefix
}
resource "azurerm_network_security_rule" "controller-linux-vxlan" {
resource_group_name = azurerm_resource_group.cluster.name
name = "allow-linux-vxlan"
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2021"
access = "Allow"
direction = "Inbound"
protocol = "Udp"
source_port_range = "*"
destination_port_range = "8472"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.controller.address_prefix
}
# Allow Prometheus to scrape node-exporter daemonset
resource "azurerm_network_security_rule" "controller-node-exporter" {
resource_group_name = azurerm_resource_group.cluster.name
@ -191,6 +237,21 @@ resource "azurerm_network_security_group" "worker" {
location = azurerm_resource_group.cluster.location
}
resource "azurerm_network_security_rule" "worker-icmp" {
resource_group_name = azurerm_resource_group.cluster.name
name = "allow-icmp"
network_security_group_name = azurerm_network_security_group.worker.name
priority = "1995"
access = "Allow"
direction = "Inbound"
protocol = "Icmp"
source_port_range = "*"
destination_port_range = "*"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.worker.address_prefix
}
resource "azurerm_network_security_rule" "worker-ssh" {
resource_group_name = azurerm_resource_group.cluster.name
@ -236,6 +297,22 @@ resource "azurerm_network_security_rule" "worker-https" {
destination_address_prefix = azurerm_subnet.worker.address_prefix
}
resource "azurerm_network_security_rule" "worker-cilium-health" {
resource_group_name = azurerm_resource_group.cluster.name
count = var.networking == "cilium" ? 1 : 0
name = "allow-cilium-health"
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2014"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "4240"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.worker.address_prefix
}
resource "azurerm_network_security_rule" "worker-vxlan" {
resource_group_name = azurerm_resource_group.cluster.name
@ -251,6 +328,21 @@ resource "azurerm_network_security_rule" "worker-vxlan" {
destination_address_prefix = azurerm_subnet.worker.address_prefix
}
resource "azurerm_network_security_rule" "worker-linux-vxlan" {
resource_group_name = azurerm_resource_group.cluster.name
name = "allow-linux-vxlan"
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2016"
access = "Allow"
direction = "Inbound"
protocol = "Udp"
source_port_range = "*"
destination_port_range = "8472"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.worker.address_prefix
}
# Allow Prometheus to scrape node-exporter daemonset
resource "azurerm_network_security_rule" "worker-node-exporter" {
resource_group_name = azurerm_resource_group.cluster.name
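Azure evaluates NSG rules in ascending priority (valid range 100-4096, unique per direction within a group), so allow-icmp at 1995 is considered before the CNI-specific allows at 2014-2021, and all of them precede the platform's built-in default rules at priority 65000 and up. A hypothetical trailing deny, purely to illustrate the ordering:

resource "azurerm_network_security_rule" "worker_deny_other_example" {
  resource_group_name         = azurerm_resource_group.cluster.name
  name                        = "deny-other-inbound"
  network_security_group_name = azurerm_network_security_group.worker.name
  priority                    = "4000" # evaluated after the allows above
  access                      = "Deny"
  direction                   = "Inbound"
  protocol                    = "*"
  source_port_range           = "*"
  destination_port_range      = "*"
  source_address_prefix       = "*"
  destination_address_prefix  = "*"
}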

View File

@ -48,7 +48,8 @@ variable "worker_type" {
variable "os_image" {
type = string
description = "Channel for a Container Linux derivative (/subscriptions/some-flatcar-upload, coreos-stable, coreos-beta, coreos-alpha)"
description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge, coreos-stable, coreos-beta, coreos-alpha)"
default = "flatcar-stable"
}
variable "disk_size" {

View File

@ -3,8 +3,8 @@
terraform {
required_version = "~> 0.12.6"
required_providers {
azurerm = "~> 2.0"
ct = "~> 0.3"
azurerm = "~> 2.8"
ct = "~> 0.4"
template = "~> 2.1"
null = "~> 2.1"
}

View File

@ -2,11 +2,11 @@
systemd:
units:
- name: docker.service
enable: true
enabled: true
- name: locksmithd.service
mask: true
- name: wait-for-dns.service
enable: true
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
@ -19,12 +19,14 @@ systemd:
[Install]
RequiredBy=kubelet.service
- name: kubelet.service
enable: true
enabled: true
contents: |
[Unit]
Description=Kubelet via Hyperkube
Description=Kubelet
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.6
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
@ -63,18 +65,18 @@ systemd:
--mount volume=var-log,target=/var/log \
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=opt-cni-bin,target=/opt/cni/bin \
docker://quay.io/poseidon/kubelet:v1.18.2 -- \
$${KUBELET_IMAGE} -- \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--healthz-port=0 \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/node \
%{~ for label in split(",", node_labels) ~}
@ -82,6 +84,7 @@ systemd:
%{~ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
@ -89,7 +92,7 @@ systemd:
[Install]
WantedBy=multi-user.target
- name: delete-node.service
enable: true
enabled: true
contents: |
[Unit]
Description=Waiting to delete Kubernetes node on shutdown
@ -110,6 +113,7 @@ storage:
${kubeconfig}
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
mode: 0644
contents:
inline: |
fs.inotify.max_user_watches=16184
@ -125,7 +129,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://quay.io/poseidon/kubelet:v1.18.2 \
docker://quay.io/poseidon/kubelet:v1.18.6 \
--net=host \
--dns=host \
--exec=/usr/local/bin/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname | tr '[:upper:]' '[:lower:]')

View File

@ -46,7 +46,7 @@ variable "vm_type" {
variable "os_image" {
type = string
description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, coreos-stable, coreos-beta, coreos-alpha)"
description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge, coreos-stable, coreos-beta, coreos-alpha)"
default = "flatcar-stable"
}

View File

@ -24,18 +24,24 @@ resource "azurerm_linux_virtual_machine_scale_set" "workers" {
caching = "ReadWrite"
}
// CoreOS Container Linux or Flatcar Container Linux (manual upload)
dynamic "source_image_reference" {
for_each = local.flavor == "coreos" ? [1] : []
# CoreOS Container Linux or Flatcar Container Linux
source_image_reference {
publisher = local.flavor == "flatcar" ? "Kinvolk" : "CoreOS"
offer = local.flavor == "flatcar" ? "flatcar-container-linux-free" : "CoreOS"
sku = local.channel
version = "latest"
}
# Gross hack for Flatcar Linux
dynamic "plan" {
for_each = local.flavor == "flatcar" ? [1] : []
content {
publisher = "CoreOS"
offer = "CoreOS"
sku = local.channel
version = "latest"
name = local.channel
publisher = "kinvolk"
product = "flatcar-container-linux-free"
}
}
source_image_id = local.flavor == "coreos" ? null : var.os_image
# Azure requires setting admin_ssh_key, though Ignition custom_data handles it too
admin_username = "core"
@ -91,9 +97,9 @@ resource "azurerm_monitor_autoscale_setting" "workers" {
# Worker Ignition configs
data "ct_config" "worker-ignition" {
content = data.template_file.worker-config.rendered
pretty_print = false
snippets = var.snippets
content = data.template_file.worker-config.rendered
strict = true
snippets = var.snippets
}
# Worker Container Linux configs
@ -105,6 +111,7 @@ data "template_file" "worker-config" {
ssh_authorized_key = var.ssh_authorized_key
cluster_dns_service_ip = cidrhost(var.service_cidr, 10)
cluster_domain_suffix = var.cluster_domain_suffix
cgroup_driver = local.flavor == "flatcar" && local.channel == "edge" ? "systemd" : "cgroupfs"
node_labels = join(",", var.node_labels)
}
}

View File

@ -11,9 +11,9 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.18.2 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Kubernetes v1.18.6 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot priority](https://typhoon.psdn.io/fedora-coreos/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/) customization
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=14d0b2087962a0f2557c184f3f523548ce19bbdc"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=835890025bfb3499aa0cd5a9704e9e7d3e7e5fe5"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]
@ -10,8 +10,9 @@ module "bootstrap" {
networking = var.networking
# only effective with Calico networking
# we should be able to use 1450 MTU, but in practice, 1410 was needed
network_encapsulation = "vxlan"
network_mtu = "1450"
network_mtu = "1410"
pod_cidr = var.pod_cidr
service_cidr = var.service_cidr
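The MTU comment reflects simple header arithmetic: VXLAN over IPv4 adds an outer IP header (20 bytes), UDP (8), VXLAN (8), and the encapsulated inner Ethernet header (14), i.e. 50 bytes of overhead, so a 1500-byte NIC should leave 1450 for the inner packet; 1410 bakes in the extra 40-byte margin that Azure empirically required, per the comment above. As a sketch:

locals {
  vxlan_overhead = 20 + 8 + 8 + 14             # outer IPv4 + UDP + VXLAN + inner Ethernet
  theoretical    = 1500 - local.vxlan_overhead # = 1450
  effective      = local.theoretical - 40      # = 1410, the value passed above
}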

View File

@ -113,10 +113,10 @@ resource "azurerm_network_interface_backend_address_pool_association" "controlle
# Controller Ignition configs
data "ct_config" "controller-ignitions" {
count = var.controller_count
content = data.template_file.controller-configs.*.rendered[count.index]
pretty_print = false
snippets = var.controller_snippets
count = var.controller_count
content = data.template_file.controller-configs.*.rendered[count.index]
strict = true
snippets = var.controller_snippets
}
# Controller Fedora CoreOS configs

View File

@ -28,7 +28,7 @@ systemd:
--network host \
--volume /var/lib/etcd:/var/lib/etcd:rw,Z \
--volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \
quay.io/coreos/etcd:v3.4.7
quay.io/coreos/etcd:v3.4.9
ExecStop=/usr/bin/podman stop etcd
[Install]
WantedBy=multi-user.target
@ -51,9 +51,10 @@ systemd:
enabled: true
contents: |
[Unit]
Description=Kubelet via Hyperkube (System Container)
Description=Kubelet (System Container)
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.6
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
@ -79,10 +80,11 @@ systemd:
--volume /var/log:/var/log \
--volume /var/run/lock:/var/run/lock:z \
--volume /opt/cni/bin:/opt/cni/bin:z \
quay.io/poseidon/kubelet:v1.18.2 \
$${KUBELET_IMAGE} \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=systemd \
--cgroups-per-qos=true \
--enforce-node-allocatable=pods \
@ -90,16 +92,14 @@ systemd:
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--healthz-port=0 \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/master \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/podman stop kubelet
Delegate=yes
@ -123,7 +123,7 @@ systemd:
--volume /opt/bootstrap/assets:/assets:ro,Z \
--volume /opt/bootstrap/apply:/apply:ro,Z \
--entrypoint=/apply \
quay.io/poseidon/kubelet:v1.18.2
quay.io/poseidon/kubelet:v1.18.6
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
ExecStartPost=-/usr/bin/podman stop bootstrap
storage:
@ -151,11 +151,11 @@ storage:
chmod -R 500 /etc/ssl/etcd
mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
sudo mkdir -p /etc/kubernetes/manifests
sudo mv static-manifests/* /etc/kubernetes/manifests/
sudo mkdir -p /opt/bootstrap/assets
sudo mv manifests /opt/bootstrap/assets/manifests
sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/
mkdir -p /etc/kubernetes/manifests
mv static-manifests/* /etc/kubernetes/manifests/
mkdir -p /opt/bootstrap/assets
mv manifests /opt/bootstrap/assets/manifests
mv manifests-networking/* /opt/bootstrap/assets/manifests/
rm -rf assets auth static-manifests tls manifests-networking
- path: /opt/bootstrap/apply
mode: 0544
@ -175,6 +175,11 @@ storage:
contents:
inline: |
fs.inotify.max_user_watches=16184
- path: /etc/sysctl.d/reverse-path-filter.conf
contents:
inline: |
net.ipv4.conf.default.rp_filter=0
net.ipv4.conf.*.rp_filter=0
- path: /etc/systemd/system.conf.d/accounting.conf
contents:
inline: |

View File

@ -21,7 +21,7 @@ resource "azurerm_subnet" "controller" {
name = "controller"
virtual_network_name = azurerm_virtual_network.network.name
address_prefix = cidrsubnet(var.host_cidr, 1, 0)
address_prefixes = [cidrsubnet(var.host_cidr, 1, 0)]
}
resource "azurerm_subnet_network_security_group_association" "controller" {
@ -34,7 +34,7 @@ resource "azurerm_subnet" "worker" {
name = "worker"
virtual_network_name = azurerm_virtual_network.network.name
address_prefix = cidrsubnet(var.host_cidr, 1, 1)
address_prefixes = [cidrsubnet(var.host_cidr, 1, 1)]
}
resource "azurerm_subnet_network_security_group_association" "worker" {

View File

@ -7,6 +7,21 @@ resource "azurerm_network_security_group" "controller" {
location = azurerm_resource_group.cluster.location
}
resource "azurerm_network_security_rule" "controller-icmp" {
resource_group_name = azurerm_resource_group.cluster.name
name = "allow-icmp"
network_security_group_name = azurerm_network_security_group.controller.name
priority = "1995"
access = "Allow"
direction = "Inbound"
protocol = "Icmp"
source_port_range = "*"
destination_port_range = "*"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.controller.address_prefix
}
resource "azurerm_network_security_rule" "controller-ssh" {
resource_group_name = azurerm_resource_group.cluster.name
@ -100,6 +115,22 @@ resource "azurerm_network_security_rule" "controller-apiserver" {
destination_address_prefix = azurerm_subnet.controller.address_prefix
}
resource "azurerm_network_security_rule" "controller-cilium-health" {
resource_group_name = azurerm_resource_group.cluster.name
count = var.networking == "cilium" ? 1 : 0
name = "allow-cilium-health"
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2019"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "4240"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.controller.address_prefix
}
resource "azurerm_network_security_rule" "controller-vxlan" {
resource_group_name = azurerm_resource_group.cluster.name
@ -115,6 +146,21 @@ resource "azurerm_network_security_rule" "controller-vxlan" {
destination_address_prefix = azurerm_subnet.controller.address_prefix
}
resource "azurerm_network_security_rule" "controller-linux-vxlan" {
resource_group_name = azurerm_resource_group.cluster.name
name = "allow-linux-vxlan"
network_security_group_name = azurerm_network_security_group.controller.name
priority = "2021"
access = "Allow"
direction = "Inbound"
protocol = "Udp"
source_port_range = "*"
destination_port_range = "8472"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.controller.address_prefix
}
# Allow Prometheus to scrape node-exporter daemonset
resource "azurerm_network_security_rule" "controller-node-exporter" {
resource_group_name = azurerm_resource_group.cluster.name
@ -191,6 +237,21 @@ resource "azurerm_network_security_group" "worker" {
location = azurerm_resource_group.cluster.location
}
resource "azurerm_network_security_rule" "worker-icmp" {
resource_group_name = azurerm_resource_group.cluster.name
name = "allow-icmp"
network_security_group_name = azurerm_network_security_group.worker.name
priority = "1995"
access = "Allow"
direction = "Inbound"
protocol = "Icmp"
source_port_range = "*"
destination_port_range = "*"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.worker.address_prefix
}
resource "azurerm_network_security_rule" "worker-ssh" {
resource_group_name = azurerm_resource_group.cluster.name
@ -236,6 +297,22 @@ resource "azurerm_network_security_rule" "worker-https" {
destination_address_prefix = azurerm_subnet.worker.address_prefix
}
resource "azurerm_network_security_rule" "worker-cilium-health" {
resource_group_name = azurerm_resource_group.cluster.name
count = var.networking == "cilium" ? 1 : 0
name = "allow-cilium-health"
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2014"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "4240"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.worker.address_prefix
}
resource "azurerm_network_security_rule" "worker-vxlan" {
resource_group_name = azurerm_resource_group.cluster.name
@ -251,6 +328,21 @@ resource "azurerm_network_security_rule" "worker-vxlan" {
destination_address_prefix = azurerm_subnet.worker.address_prefix
}
resource "azurerm_network_security_rule" "worker-linux-vxlan" {
resource_group_name = azurerm_resource_group.cluster.name
name = "allow-linux-vxlan"
network_security_group_name = azurerm_network_security_group.worker.name
priority = "2016"
access = "Allow"
direction = "Inbound"
protocol = "Udp"
source_port_range = "*"
destination_port_range = "8472"
source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix]
destination_address_prefix = azurerm_subnet.worker.address_prefix
}
# Allow Prometheus to scrape node-exporter daemonset
resource "azurerm_network_security_rule" "worker-node-exporter" {
resource_group_name = azurerm_resource_group.cluster.name

View File

@ -3,8 +3,8 @@
terraform {
required_version = "~> 0.12.6"
required_providers {
azurerm = "~> 2.0"
ct = "~> 0.3"
azurerm = "~> 2.8"
ct = "~> 0.4"
template = "~> 2.1"
null = "~> 2.1"
}

View File

@ -21,9 +21,10 @@ systemd:
enabled: true
contents: |
[Unit]
Description=Kubelet via Hyperkube (System Container)
Description=Kubelet (System Container)
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.6
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
@ -49,10 +50,11 @@ systemd:
--volume /var/log:/var/log \
--volume /var/run/lock:/var/run/lock:z \
--volume /opt/cni/bin:/opt/cni/bin:z \
quay.io/poseidon/kubelet:v1.18.2 \
$${KUBELET_IMAGE} \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=systemd \
--cgroups-per-qos=true \
--enforce-node-allocatable=pods \
@ -60,10 +62,8 @@ systemd:
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--healthz-port=0 \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/node \
%{~ for label in split(",", node_labels) ~}
@ -71,6 +71,7 @@ systemd:
%{~ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/podman stop kubelet
Delegate=yes
@ -87,7 +88,7 @@ systemd:
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.18.2 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME'
ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.18.6 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME'
[Install]
WantedBy=multi-user.target
storage:
@ -103,6 +104,11 @@ storage:
contents:
inline: |
fs.inotify.max_user_watches=16184
- path: /etc/sysctl.d/reverse-path-filter.conf
contents:
inline: |
net.ipv4.conf.default.rp_filter=0
net.ipv4.conf.*.rp_filter=0
- path: /etc/systemd/system.conf.d/accounting.conf
contents:
inline: |

View File

@ -72,9 +72,9 @@ resource "azurerm_monitor_autoscale_setting" "workers" {
# Worker Ignition configs
data "ct_config" "worker-ignition" {
content = data.template_file.worker-config.rendered
pretty_print = false
snippets = var.snippets
content = data.template_file.worker-config.rendered
strict = true
snippets = var.snippets
}
# Worker Fedora CoreOS configs

View File

@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.18.2 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* Kubernetes v1.18.6 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=14d0b2087962a0f2557c184f3f523548ce19bbdc"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=835890025bfb3499aa0cd5a9704e9e7d3e7e5fe5"
cluster_name = var.cluster_name
api_servers = [var.k8s_domain_name]

View File

@ -2,12 +2,12 @@
systemd:
units:
- name: etcd-member.service
enable: true
enabled: true
dropins:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.4.7"
Environment="ETCD_IMAGE_TAG=v3.4.9"
Environment="ETCD_IMAGE_URL=docker://quay.io/coreos/etcd"
Environment="RKT_RUN_ARGS=--insecure-options=image"
Environment="ETCD_NAME=${etcd_name}"
@ -28,11 +28,11 @@ systemd:
Environment="ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer.key"
Environment="ETCD_PEER_CLIENT_CERT_AUTH=true"
- name: docker.service
enable: true
enabled: true
- name: locksmithd.service
mask: true
- name: kubelet.path
enable: true
enabled: true
contents: |
[Unit]
Description=Watch for kubeconfig
@ -41,7 +41,7 @@ systemd:
[Install]
WantedBy=multi-user.target
- name: wait-for-dns.service
enable: true
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
@ -57,9 +57,10 @@ systemd:
- name: kubelet.service
contents: |
[Unit]
Description=Kubelet via Hyperkube
Description=Kubelet
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.6
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
@ -103,26 +104,25 @@ systemd:
--mount volume=etc-iscsi,target=/etc/iscsi \
--volume usr-sbin-iscsiadm,kind=host,source=/usr/sbin/iscsiadm \
--mount volume=usr-sbin-iscsiadm,target=/sbin/iscsiadm \
docker://quay.io/poseidon/kubelet:v1.18.2 -- \
$${KUBELET_IMAGE} -- \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--healthz-port=0 \
--hostname-override=${domain_name} \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/master \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
@ -147,7 +147,7 @@ systemd:
--volume script,kind=host,source=/opt/bootstrap/apply \
--mount volume=script,target=/apply \
--insecure-options=image \
docker://quay.io/poseidon/kubelet:v1.18.2 \
docker://quay.io/poseidon/kubelet:v1.18.6 \
--net=host \
--dns=host \
--exec=/apply
@ -158,6 +158,7 @@ storage:
directories:
- path: /etc/kubernetes
filesystem: root
mode: 0755
files:
- path: /etc/hostname
filesystem: root
@ -181,11 +182,11 @@ storage:
chmod -R 500 /etc/ssl/etcd
mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
sudo mkdir -p /etc/kubernetes/manifests
sudo mv static-manifests/* /etc/kubernetes/manifests/
sudo mkdir -p /opt/bootstrap/assets
sudo mv manifests /opt/bootstrap/assets/manifests
sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/
mkdir -p /etc/kubernetes/manifests
mv static-manifests/* /etc/kubernetes/manifests/
mkdir -p /opt/bootstrap/assets
mv manifests /opt/bootstrap/assets/manifests
mv manifests-networking/* /opt/bootstrap/assets/manifests/
rm -rf assets auth static-manifests tls manifests-networking
- path: /opt/bootstrap/apply
filesystem: root
@ -204,6 +205,7 @@ storage:
done
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
mode: 0644
contents:
inline: |
fs.inotify.max_user_watches=16184

View File

@ -2,7 +2,7 @@
systemd:
units:
- name: installer.service
enable: true
enabled: true
contents: |
[Unit]
Requires=network-online.target

View File

@ -2,11 +2,11 @@
systemd:
units:
- name: docker.service
enable: true
enabled: true
- name: locksmithd.service
mask: true
- name: kubelet.path
enable: true
enabled: true
contents: |
[Unit]
Description=Watch for kubeconfig
@ -15,7 +15,7 @@ systemd:
[Install]
WantedBy=multi-user.target
- name: wait-for-dns.service
enable: true
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
@ -30,9 +30,10 @@ systemd:
- name: kubelet.service
contents: |
[Unit]
Description=Kubelet via Hyperkube
Description=Kubelet
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.6
Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver}
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
@ -76,20 +77,19 @@ systemd:
--mount volume=etc-iscsi,target=/etc/iscsi \
--volume usr-sbin-iscsiadm,kind=host,source=/usr/sbin/iscsiadm \
--mount volume=usr-sbin-iscsiadm,target=/sbin/iscsiadm \
docker://quay.io/poseidon/kubelet:v1.18.2 -- \
$${KUBELET_IMAGE} -- \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=$${KUBELET_CGROUP_DRIVER} \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--healthz-port=0 \
--hostname-override=${domain_name} \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/node \
%{~ for label in compact(split(",", node_labels)) ~}
@ -100,6 +100,7 @@ systemd:
%{~ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
@ -111,6 +112,7 @@ storage:
directories:
- path: /etc/kubernetes
filesystem: root
mode: 0755
files:
- path: /etc/hostname
filesystem: root
@ -120,6 +122,7 @@ storage:
${domain_name}
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
mode: 0644
contents:
inline: |
fs.inotify.max_user_watches=16184

View File

@ -141,10 +141,10 @@ resource "matchbox_profile" "controllers" {
}
data "ct_config" "controller-ignitions" {
count = length(var.controllers)
content = data.template_file.controller-configs.*.rendered[count.index]
pretty_print = false
snippets = lookup(var.snippets, var.controllers.*.name[count.index], [])
count = length(var.controllers)
content = data.template_file.controller-configs.*.rendered[count.index]
strict = true
snippets = lookup(var.snippets, var.controllers.*.name[count.index], [])
}
data "template_file" "controller-configs" {
@ -171,10 +171,10 @@ resource "matchbox_profile" "workers" {
}
data "ct_config" "worker-ignitions" {
count = length(var.workers)
content = data.template_file.worker-configs.*.rendered[count.index]
pretty_print = false
snippets = lookup(var.snippets, var.workers.*.name[count.index], [])
count = length(var.workers)
content = data.template_file.worker-configs.*.rendered[count.index]
strict = true
snippets = lookup(var.snippets, var.workers.*.name[count.index], [])
}
data "template_file" "worker-configs" {

View File

@ -4,7 +4,7 @@ terraform {
required_version = "~> 0.12.6"
required_providers {
matchbox = "~> 0.3.0"
ct = "~> 0.3"
ct = "~> 0.4"
template = "~> 2.1"
null = "~> 2.1"
}

View File

@ -11,9 +11,9 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.18.2 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Kubernetes v1.18.6 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=14d0b2087962a0f2557c184f3f523548ce19bbdc"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=835890025bfb3499aa0cd5a9704e9e7d3e7e5fe5"
cluster_name = var.cluster_name
api_servers = [var.k8s_domain_name]

View File

@ -28,7 +28,7 @@ systemd:
--network host \
--volume /var/lib/etcd:/var/lib/etcd:rw,Z \
--volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \
quay.io/coreos/etcd:v3.4.7
quay.io/coreos/etcd:v3.4.9
ExecStop=/usr/bin/podman stop etcd
[Install]
WantedBy=multi-user.target
@ -50,9 +50,10 @@ systemd:
- name: kubelet.service
contents: |
[Unit]
Description=Kubelet via Hyperkube (System Container)
Description=Kubelet (System Container)
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.6
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
@ -80,10 +81,11 @@ systemd:
--volume /opt/cni/bin:/opt/cni/bin:z \
--volume /etc/iscsi:/etc/iscsi \
--volume /sbin/iscsiadm:/sbin/iscsiadm \
quay.io/poseidon/kubelet:v1.18.2 \
$${KUBELET_IMAGE} \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=systemd \
--cgroups-per-qos=true \
--enforce-node-allocatable=pods \
@ -91,17 +93,15 @@ systemd:
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--healthz-port=0 \
--hostname-override=${domain_name} \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/master \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/podman stop kubelet
Delegate=yes
@ -134,7 +134,7 @@ systemd:
--volume /opt/bootstrap/assets:/assets:ro,Z \
--volume /opt/bootstrap/apply:/apply:ro,Z \
--entrypoint=/apply \
quay.io/poseidon/kubelet:v1.18.2
quay.io/poseidon/kubelet:v1.18.6
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
ExecStartPost=-/usr/bin/podman stop bootstrap
storage:
@ -162,11 +162,11 @@ storage:
chmod -R 500 /etc/ssl/etcd
mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
sudo mkdir -p /etc/kubernetes/manifests
sudo mv static-manifests/* /etc/kubernetes/manifests/
sudo mkdir -p /opt/bootstrap/assets
sudo mv manifests /opt/bootstrap/assets/manifests
sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/
mkdir -p /etc/kubernetes/manifests
mv static-manifests/* /etc/kubernetes/manifests/
mkdir -p /opt/bootstrap/assets
mv manifests /opt/bootstrap/assets/manifests
mv manifests-networking/* /opt/bootstrap/assets/manifests/
rm -rf assets auth static-manifests tls manifests-networking
- path: /opt/bootstrap/apply
mode: 0544
@ -186,6 +186,11 @@ storage:
contents:
inline: |
fs.inotify.max_user_watches=16184
- path: /etc/sysctl.d/reverse-path-filter.conf
contents:
inline: |
net.ipv4.conf.default.rp_filter=0
net.ipv4.conf.*.rp_filter=0
- path: /etc/systemd/system.conf.d/accounting.conf
contents:
inline: |

View File

@ -20,9 +20,10 @@ systemd:
- name: kubelet.service
contents: |
[Unit]
Description=Kubelet via Hyperkube (System Container)
Description=Kubelet (System Container)
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.6
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
@ -50,10 +51,11 @@ systemd:
--volume /opt/cni/bin:/opt/cni/bin:z \
--volume /etc/iscsi:/etc/iscsi \
--volume /sbin/iscsiadm:/sbin/iscsiadm \
quay.io/poseidon/kubelet:v1.18.2 \
$${KUBELET_IMAGE} \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=systemd \
--cgroups-per-qos=true \
--enforce-node-allocatable=pods \
@ -61,11 +63,9 @@ systemd:
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--healthz-port=0 \
--hostname-override=${domain_name} \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/node \
%{~ for label in compact(split(",", node_labels)) ~}
@ -76,6 +76,7 @@ systemd:
%{~ endfor ~}
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/podman stop kubelet
Delegate=yes
@ -105,6 +106,11 @@ storage:
contents:
inline: |
fs.inotify.max_user_watches=16184
- path: /etc/sysctl.d/reverse-path-filter.conf
contents:
inline: |
net.ipv4.conf.default.rp_filter=0
net.ipv4.conf.*.rp_filter=0
- path: /etc/systemd/system.conf.d/accounting.conf
contents:
inline: |

View File

@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.18.2 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* Kubernetes v1.18.6 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
* Ready for Ingress, Prometheus, Grafana, CSI, and other [addons](https://typhoon.psdn.io/addons/overview/)

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=14d0b2087962a0f2557c184f3f523548ce19bbdc"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=835890025bfb3499aa0cd5a9704e9e7d3e7e5fe5"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]

View File

@ -2,12 +2,12 @@
systemd:
units:
- name: etcd-member.service
enable: true
enabled: true
dropins:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.4.7"
Environment="ETCD_IMAGE_TAG=v3.4.9"
Environment="ETCD_IMAGE_URL=docker://quay.io/coreos/etcd"
Environment="RKT_RUN_ARGS=--insecure-options=image"
Environment="ETCD_NAME=${etcd_name}"
@ -28,11 +28,11 @@ systemd:
Environment="ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer.key"
Environment="ETCD_PEER_CLIENT_CERT_AUTH=true"
- name: docker.service
enable: true
enabled: true
- name: locksmithd.service
mask: true
- name: kubelet.path
enable: true
enabled: true
contents: |
[Unit]
Description=Watch for kubeconfig
@ -41,7 +41,7 @@ systemd:
[Install]
WantedBy=multi-user.target
- name: wait-for-dns.service
enable: true
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
@ -57,11 +57,12 @@ systemd:
- name: kubelet.service
contents: |
[Unit]
Description=Kubelet via Hyperkube
Description=Kubelet
Requires=coreos-metadata.service
After=coreos-metadata.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.6
EnvironmentFile=/run/metadata/coreos
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
@ -101,25 +102,24 @@ systemd:
--mount volume=var-log,target=/var/log \
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=opt-cni-bin,target=/opt/cni/bin \
docker://quay.io/poseidon/kubelet:v1.18.2 -- \
$${KUBELET_IMAGE} -- \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--healthz-port=0 \
--hostname-override=$${COREOS_DIGITALOCEAN_IPV4_PRIVATE_0} \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/master \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
@ -144,7 +144,7 @@ systemd:
--volume script,kind=host,source=/opt/bootstrap/apply \
--mount volume=script,target=/apply \
--insecure-options=image \
docker://quay.io/poseidon/kubelet:v1.18.2 \
docker://quay.io/poseidon/kubelet:v1.18.6 \
--net=host \
--dns=host \
--exec=/apply
@ -155,6 +155,7 @@ storage:
directories:
- path: /etc/kubernetes
filesystem: root
mode: 0755
files:
- path: /opt/bootstrap/layout
filesystem: root
@ -172,11 +173,11 @@ storage:
chmod -R 500 /etc/ssl/etcd
mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
sudo mkdir -p /etc/kubernetes/manifests
sudo mv static-manifests/* /etc/kubernetes/manifests/
sudo mkdir -p /opt/bootstrap/assets
sudo mv manifests /opt/bootstrap/assets/manifests
sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/
mkdir -p /etc/kubernetes/manifests
mv static-manifests/* /etc/kubernetes/manifests/
mkdir -p /opt/bootstrap/assets
mv manifests /opt/bootstrap/assets/manifests
mv manifests-networking/* /opt/bootstrap/assets/manifests/
rm -rf assets auth static-manifests tls manifests-networking
- path: /opt/bootstrap/apply
filesystem: root
@ -195,6 +196,7 @@ storage:
done
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
mode: 0644
contents:
inline: |
fs.inotify.max_user_watches=16184

View File

@ -2,11 +2,11 @@
systemd:
units:
- name: docker.service
enable: true
enabled: true
- name: locksmithd.service
mask: true
- name: kubelet.path
enable: true
enabled: true
contents: |
[Unit]
Description=Watch for kubeconfig
@ -15,7 +15,7 @@ systemd:
[Install]
WantedBy=multi-user.target
- name: wait-for-dns.service
enable: true
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
@ -30,11 +30,12 @@ systemd:
- name: kubelet.service
contents: |
[Unit]
Description=Kubelet via Hyperkube
Description=Kubelet
Requires=coreos-metadata.service
After=coreos-metadata.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.6
EnvironmentFile=/run/metadata/coreos
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
@ -74,23 +75,23 @@ systemd:
--mount volume=var-log,target=/var/log \
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=opt-cni-bin,target=/opt/cni/bin \
docker://quay.io/poseidon/kubelet:v1.18.2 -- \
$${KUBELET_IMAGE} -- \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--healthz-port=0 \
--hostname-override=$${COREOS_DIGITALOCEAN_IPV4_PRIVATE_0} \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/node \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
@ -98,7 +99,7 @@ systemd:
[Install]
WantedBy=multi-user.target
- name: delete-node.service
enable: true
enabled: true
contents: |
[Unit]
Description=Waiting to delete Kubernetes node on shutdown
@ -113,9 +114,11 @@ storage:
directories:
- path: /etc/kubernetes
filesystem: root
mode: 0755
files:
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
mode: 0644
contents:
inline: |
fs.inotify.max_user_watches=16184
@ -131,7 +134,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://quay.io/poseidon/kubelet:v1.18.2 \
docker://quay.io/poseidon/kubelet:v1.18.6 \
--net=host \
--dns=host \
--exec=/usr/local/bin/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)

View File

@ -46,9 +46,10 @@ resource "digitalocean_droplet" "controllers" {
size = var.controller_type
# network
# only official DigitalOcean images support IPv6
ipv6 = local.is_official_image
private_networking = true
vpc_uuid = digitalocean_vpc.network.id
# TODO: Only official DigitalOcean images support IPv6
ipv6 = false
user_data = data.ct_config.controller-ignitions.*.rendered[count.index]
ssh_keys = var.ssh_fingerprints
@ -69,10 +70,10 @@ resource "digitalocean_tag" "controllers" {
# Controller Ignition configs
data "ct_config" "controller-ignitions" {
count = var.controller_count
content = data.template_file.controller-configs.*.rendered[count.index]
pretty_print = false
snippets = var.controller_snippets
count = var.controller_count
content = data.template_file.controller-configs.*.rendered[count.index]
strict = true
snippets = var.controller_snippets
}
# Controller Container Linux configs

View File

@ -1,3 +1,10 @@
# Network VPC
resource "digitalocean_vpc" "network" {
name = var.cluster_name
region = var.region
description = "Network for ${var.cluster_name} cluster"
}
resource "digitalocean_firewall" "rules" {
name = var.cluster_name
@ -6,6 +13,11 @@ resource "digitalocean_firewall" "rules" {
digitalocean_tag.workers.name
]
inbound_rule {
protocol = "icmp"
source_tags = [digitalocean_tag.controllers.name, digitalocean_tag.workers.name]
}
# allow ssh, internal flannel, internal node-exporter, internal kubelet
inbound_rule {
protocol = "tcp"
@ -13,12 +25,27 @@ resource "digitalocean_firewall" "rules" {
source_addresses = ["0.0.0.0/0", "::/0"]
}
# Cilium health
inbound_rule {
protocol = "tcp"
port_range = "4240"
source_tags = [digitalocean_tag.controllers.name, digitalocean_tag.workers.name]
}
# IANA vxlan (flannel, calico)
inbound_rule {
protocol = "udp"
port_range = "4789"
source_tags = [digitalocean_tag.controllers.name, digitalocean_tag.workers.name]
}
# Linux vxlan (Cilium)
inbound_rule {
protocol = "udp"
port_range = "8472"
source_tags = [digitalocean_tag.controllers.name, digitalocean_tag.workers.name]
}
# Allow Prometheus to scrape node-exporter
inbound_rule {
protocol = "tcp"
@ -33,6 +60,7 @@ resource "digitalocean_firewall" "rules" {
source_tags = [digitalocean_tag.workers.name]
}
# Kubelet
inbound_rule {
protocol = "tcp"
port_range = "10250"

View File

@ -2,6 +2,8 @@ output "kubeconfig-admin" {
value = module.bootstrap.kubeconfig-admin
}
# Outputs for Kubernetes Ingress
output "controllers_dns" {
value = digitalocean_record.controllers[0].fqdn
}
@ -45,3 +47,10 @@ output "worker_tag" {
value = digitalocean_tag.workers.name
}
# Outputs for custom load balancing
output "vpc_id" {
description = "ID of the cluster VPC"
value = digitalocean_vpc.network.id
}

View File

@ -3,8 +3,8 @@
terraform {
required_version = "~> 0.12.6"
required_providers {
digitalocean = "~> 1.3"
ct = "~> 0.3"
digitalocean = "~> 1.16"
ct = "~> 0.4"
template = "~> 2.1"
null = "~> 2.1"
}

View File

@ -35,9 +35,10 @@ resource "digitalocean_droplet" "workers" {
size = var.worker_type
# network
# only official DigitalOcean images support IPv6
ipv6 = local.is_official_image
private_networking = true
vpc_uuid = digitalocean_vpc.network.id
# only official DigitalOcean images support IPv6
ipv6 = local.is_official_image
user_data = data.ct_config.worker-ignition.rendered
ssh_keys = var.ssh_fingerprints
@ -58,9 +59,9 @@ resource "digitalocean_tag" "workers" {
# Worker Ignition config
data "ct_config" "worker-ignition" {
content = data.template_file.worker-config.rendered
pretty_print = false
snippets = var.worker_snippets
content = data.template_file.worker-config.rendered
strict = true
snippets = var.worker_snippets
}
# Worker Container Linux config

View File

@ -11,9 +11,9 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.18.2 (upstream)
* Kubernetes v1.18.6 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/) customization
* Ready for Ingress, Prometheus, Grafana, CSI, and other [addons](https://typhoon.psdn.io/addons/overview/)

View File

@ -1,6 +1,6 @@
# Kubernetes assets (kubeconfig, manifests)
module "bootstrap" {
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=14d0b2087962a0f2557c184f3f523548ce19bbdc"
source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=835890025bfb3499aa0cd5a9704e9e7d3e7e5fe5"
cluster_name = var.cluster_name
api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)]

View File

@ -41,9 +41,10 @@ resource "digitalocean_droplet" "controllers" {
size = var.controller_type
# network
# TODO: Only official DigitalOcean images support IPv6
ipv6 = false
private_networking = true
vpc_uuid = digitalocean_vpc.network.id
# TODO: Only official DigitalOcean images support IPv6
ipv6 = false
user_data = data.ct_config.controller-ignitions.*.rendered[count.index]
ssh_keys = var.ssh_fingerprints
@ -64,10 +65,10 @@ resource "digitalocean_tag" "controllers" {
# Controller Ignition configs
data "ct_config" "controller-ignitions" {
count = var.controller_count
content = data.template_file.controller-configs.*.rendered[count.index]
strict = true
snippets = var.controller_snippets
count = var.controller_count
content = data.template_file.controller-configs.*.rendered[count.index]
strict = true
snippets = var.controller_snippets
}
# Controller Fedora CoreOS configs

View File

@ -28,7 +28,7 @@ systemd:
--network host \
--volume /var/lib/etcd:/var/lib/etcd:rw,Z \
--volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \
quay.io/coreos/etcd:v3.4.7
quay.io/coreos/etcd:v3.4.9
ExecStop=/usr/bin/podman stop etcd
[Install]
WantedBy=multi-user.target
@ -50,11 +50,12 @@ systemd:
- name: kubelet.service
contents: |
[Unit]
Description=Kubelet via Hyperkube (System Container)
Description=Kubelet (System Container)
Requires=afterburn.service
After=afterburn.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.6
EnvironmentFile=/run/metadata/afterburn
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
@ -81,10 +82,11 @@ systemd:
--volume /var/log:/var/log \
--volume /var/run/lock:/var/run/lock:z \
--volume /opt/cni/bin:/opt/cni/bin:z \
quay.io/poseidon/kubelet:v1.18.2 \
$${KUBELET_IMAGE} \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=systemd \
--cgroups-per-qos=true \
--enforce-node-allocatable=pods \
@ -92,17 +94,15 @@ systemd:
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--healthz-port=0 \
--hostname-override=$${AFTERBURN_DIGITALOCEAN_IPV4_PRIVATE_0} \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/master \
--node-labels=node.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/podman stop kubelet
Delegate=yes
@ -135,7 +135,7 @@ systemd:
--volume /opt/bootstrap/assets:/assets:ro,Z \
--volume /opt/bootstrap/apply:/apply:ro,Z \
--entrypoint=/apply \
quay.io/poseidon/kubelet:v1.18.2
quay.io/poseidon/kubelet:v1.18.6
ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
ExecStartPost=-/usr/bin/podman stop bootstrap
storage:
@ -158,11 +158,11 @@ storage:
chmod -R 500 /etc/ssl/etcd
mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/
mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/
sudo mkdir -p /etc/kubernetes/manifests
sudo mv static-manifests/* /etc/kubernetes/manifests/
sudo mkdir -p /opt/bootstrap/assets
sudo mv manifests /opt/bootstrap/assets/manifests
sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/
mkdir -p /etc/kubernetes/manifests
mv static-manifests/* /etc/kubernetes/manifests/
mkdir -p /opt/bootstrap/assets
mv manifests /opt/bootstrap/assets/manifests
mv manifests-networking/* /opt/bootstrap/assets/manifests/
rm -rf assets auth static-manifests tls manifests-networking
- path: /opt/bootstrap/apply
mode: 0544
@ -182,6 +182,11 @@ storage:
contents:
inline: |
fs.inotify.max_user_watches=16184
- path: /etc/sysctl.d/reverse-path-filter.conf
contents:
inline: |
net.ipv4.conf.default.rp_filter=0
net.ipv4.conf.*.rp_filter=0
- path: /etc/systemd/system.conf.d/accounting.conf
contents:
inline: |

View File

@ -21,11 +21,12 @@ systemd:
enabled: true
contents: |
[Unit]
Description=Kubelet via Hyperkube (System Container)
Description=Kubelet (System Container)
Requires=afterburn.service
After=afterburn.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.6
EnvironmentFile=/run/metadata/afterburn
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
@ -52,10 +53,11 @@ systemd:
--volume /var/log:/var/log \
--volume /var/run/lock:/var/run/lock:z \
--volume /opt/cni/bin:/opt/cni/bin:z \
quay.io/poseidon/kubelet:v1.18.2 \
$${KUBELET_IMAGE} \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--cgroup-driver=systemd \
--cgroups-per-qos=true \
--enforce-node-allocatable=pods \
@ -63,15 +65,14 @@ systemd:
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--healthz-port=0 \
--hostname-override=$${AFTERBURN_DIGITALOCEAN_IPV4_PRIVATE_0} \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--network-plugin=cni \
--node-labels=node.kubernetes.io/node \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--rotate-certificates \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/podman stop kubelet
Delegate=yes
@ -97,7 +98,7 @@ systemd:
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.18.2 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME'
ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.18.6 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME'
[Install]
WantedBy=multi-user.target
storage:
@ -108,6 +109,11 @@ storage:
contents:
inline: |
fs.inotify.max_user_watches=16184
- path: /etc/sysctl.d/reverse-path-filter.conf
contents:
inline: |
net.ipv4.conf.default.rp_filter=0
net.ipv4.conf.*.rp_filter=0
- path: /etc/systemd/system.conf.d/accounting.conf
contents:
inline: |

View File

@ -1,3 +1,10 @@
# Network VPC
resource "digitalocean_vpc" "network" {
name = var.cluster_name
region = var.region
description = "Network for ${var.cluster_name} cluster"
}
resource "digitalocean_firewall" "rules" {
name = var.cluster_name
@ -6,6 +13,11 @@ resource "digitalocean_firewall" "rules" {
digitalocean_tag.workers.name
]
inbound_rule {
protocol = "icmp"
source_tags = [digitalocean_tag.controllers.name, digitalocean_tag.workers.name]
}
# allow ssh, internal flannel, internal node-exporter, internal kubelet
inbound_rule {
protocol = "tcp"
@ -13,12 +25,27 @@ resource "digitalocean_firewall" "rules" {
source_addresses = ["0.0.0.0/0", "::/0"]
}
# Cilium health
inbound_rule {
protocol = "tcp"
port_range = "4240"
source_tags = [digitalocean_tag.controllers.name, digitalocean_tag.workers.name]
}
# IANA vxlan (flannel, calico)
inbound_rule {
protocol = "udp"
port_range = "4789"
source_tags = [digitalocean_tag.controllers.name, digitalocean_tag.workers.name]
}
# Linux vxlan (Cilium)
inbound_rule {
protocol = "udp"
port_range = "8472"
source_tags = [digitalocean_tag.controllers.name, digitalocean_tag.workers.name]
}
# Allow Prometheus to scrape node-exporter
inbound_rule {
protocol = "tcp"
@ -33,6 +60,7 @@ resource "digitalocean_firewall" "rules" {
source_tags = [digitalocean_tag.workers.name]
}
# Kubelet
inbound_rule {
protocol = "tcp"
port_range = "10250"

View File

@ -2,6 +2,8 @@ output "kubeconfig-admin" {
value = module.bootstrap.kubeconfig-admin
}
# Outputs for Kubernetes Ingress
output "controllers_dns" {
value = digitalocean_record.controllers[0].fqdn
}
@ -45,3 +47,9 @@ output "worker_tag" {
value = digitalocean_tag.workers.name
}
# Outputs for custom load balancing
output "vpc_id" {
description = "ID of the cluster VPC"
value = digitalocean_vpc.network.id
}

View File

@ -3,8 +3,8 @@
terraform {
required_version = "~> 0.12.6"
required_providers {
digitalocean = "~> 1.3"
ct = "~> 0.3"
digitalocean = "~> 1.16"
ct = "~> 0.4"
template = "~> 2.1"
null = "~> 2.1"
}

View File

@ -37,9 +37,10 @@ resource "digitalocean_droplet" "workers" {
size = var.worker_type
# network
# TODO: Only official DigitalOcean images support IPv6
ipv6 = false
private_networking = true
vpc_uuid = digitalocean_vpc.network.id
# TODO: Only official DigitalOcean images support IPv6
ipv6 = false
user_data = data.ct_config.worker-ignition.rendered
ssh_keys = var.ssh_fingerprints
@ -60,9 +61,9 @@ resource "digitalocean_tag" "workers" {
# Worker Ignition config
data "ct_config" "worker-ignition" {
content = data.template_file.worker-config.rendered
strict = true
snippets = var.worker_snippets
content = data.template_file.worker-config.rendered
strict = true
snippets = var.worker_snippets
}
# Worker Fedora CoreOS config

View File

@ -174,3 +174,34 @@ module "nemo" {
To customize low-level Kubernetes control plane bootstrapping, see the [poseidon/terraform-render-bootstrap](https://github.com/poseidon/terraform-render-bootstrap) Terraform module.
## Kubelet
Typhoon publishes Kubelet [container images](/topics/security/#container-images) to Quay.io (default) and to Dockerhub (in case of a Quay [outage](https://github.com/poseidon/typhoon/issues/735) or breach). Quay automated builds also provide the option for fully verifiable tagged images (`build-{short_sha}`).
To set an alternative Kubelet image, use a snippet to set a systemd dropin.
```yaml
# host-image-override.yaml
variant: fcos      # remove this line for Flatcar Linux
version: 1.0.0     # remove this line for Flatcar Linux
systemd:
units:
- name: kubelet.service
dropins:
- name: 10-image-override.conf
contents: |
[Service]
Environment=KUBELET_IMAGE=docker.io/psdn/kubelet:v1.18.3
```
```tf
module "nemo" {
...
worker_snippets = [
file("./snippets/host-image-override.yaml")
]
...
}
```

View File

@ -13,7 +13,7 @@ Internal Terraform Modules:
## AWS
Create a cluster following the AWS [tutorial](../cl/aws.md#cluster). Define a worker pool using the AWS internal `workers` module.
Create a cluster following the AWS [tutorial](../flatcar-linux/aws.md#cluster). Define a worker pool using the AWS internal `workers` module.
```tf
module "tempest-worker-pool" {
@ -65,7 +65,8 @@ The AWS internal `workers` module supports a number of [variables](https://githu
|:-----|:------------|:--------|:--------|
| worker_count | Number of instances | 1 | 3 |
| instance_type | EC2 instance type | "t3.small" | "t3.medium" |
| os_image | AMI channel for a Container Linux derivative | "flatcar-stable" | flatcar-stable, flatcar-beta, flatcar-alpha, coreos-stable, coreos-beta, coreos-alpha |
| os_image | AMI channel for a Container Linux derivative | "flatcar-stable" | flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge |
| os_stream | Fedora CoreOS stream for compute instances | "stable" | "testing", "next" |
| disk_size | Size of the EBS volume in GB | 40 | 100 |
| disk_type | Type of the EBS volume | "gp2" | standard, gp2, io1 |
| disk_iops | IOPS of the EBS volume | 0 (i.e. auto) | 400 |
@ -78,11 +79,11 @@ Check the list of valid [instance types](https://aws.amazon.com/ec2/instance-typ
## Azure
Create a cluster following the Azure [tutorial](../cl/azure.md#cluster). Define a worker pool using the Azure internal `workers` module.
Create a cluster following the Azure [tutorial](../flatcar-linux/azure.md#cluster). Define a worker pool using the Azure internal `workers` module.
```tf
module "ramius-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes/workers?ref=v1.18.2"
source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes/workers?ref=v1.18.6"
# Azure
region = module.ramius.region
@ -134,7 +135,7 @@ The Azure internal `workers` module supports a number of [variables](https://git
|:-----|:------------|:--------|:--------|
| worker_count | Number of instances | 1 | 3 |
| vm_type | Machine type for instances | "Standard_DS1_v2" | See below |
| os_image | Channel for a Container Linux derivative | "flatcar-stable" | flatcar-stable, flatcar-beta, coreos-stable, coreos-beta, coreos-alpha |
| os_image | Channel for a Container Linux derivative | "flatcar-stable" | flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge |
| priority | Set priority to Spot to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time | "Regular" | "Spot" |
| snippets | Container Linux Config snippets | [] | [examples](/advanced/customization/) |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
@ -144,11 +145,11 @@ Check the list of valid [machine types](https://azure.microsoft.com/en-us/pricin
## Google Cloud
Create a cluster following the Google Cloud [tutorial](../cl/google-cloud.md#cluster). Define a worker pool using the Google Cloud internal `workers` module.
Create a cluster following the Google Cloud [tutorial](../flatcar-linux/google-cloud.md#cluster). Define a worker pool using the Google Cloud internal `workers` module.
```tf
module "yavin-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.18.2"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.18.6"
# Google Cloud
region = "europe-west2"
@ -179,11 +180,11 @@ Verify a managed instance group of workers joins the cluster within a few minute
```
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.18.2
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.18.2
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.18.2
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.18.2
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.18.2
yavin-controller-0.c.example-com.internal Ready 6m v1.18.6
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.18.6
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.18.6
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.18.6
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.18.6
```
### Variables
@ -199,7 +200,7 @@ The Google Cloud internal `workers` module supports a number of [variables](http
| region | Region for the worker pool instances. May differ from the cluster's region | "europe-west2" |
| network | Must be set to `network_name` output by cluster | module.cluster.network_name |
| kubeconfig | Must be set to `kubeconfig` output by cluster | module.cluster.kubeconfig |
| os_image | Container Linux image for compute instances | "fedora-coreos-or-flatcar-image", coreos-stable, coreos-beta, coreos-alpha |
| os_image | Container Linux image for compute instances | "uploaded-flatcar-image" |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
Check the list of regions [docs](https://cloud.google.com/compute/docs/regions-zones/regions-zones) or with `gcloud compute regions list`.
@ -210,6 +211,7 @@ Check the list of regions [docs](https://cloud.google.com/compute/docs/regions-z
|:-----|:------------|:--------|:--------|
| worker_count | Number of instances | 1 | 3 |
| machine_type | Compute instance machine type | "n1-standard-1" | See below |
| os_stream | Fedora CoreOS stream for compute instances | "stable" | "testing", "next" |
| disk_size | Size of the disk in GB | 40 | 100 |
| preemptible | If true, Compute Engine will terminate instances randomly within 24 hours | false | true |
| snippets | Container Linux Config snippets | [] | [examples](/advanced/customization/) |

View File

@ -30,6 +30,7 @@ Add a DigitalOcean load balancer to distribute IPv4 TCP traffic (HTTP/HTTPS Ingr
resource "digitalocean_loadbalancer" "ingress" {
name = "ingress"
region = "fra1"
vpc_uuid = module.nemo.vpc_id
droplet_tag = module.nemo.worker_tag
healthcheck {
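For context, a complete load balancer definition might look like the sketch below. The forwarding rule and health check values are illustrative assumptions, not part of this diff; `module.nemo` is the cluster module from this guide, and `vpc_id` / `worker_tag` are its documented outputs.

```tf
resource "digitalocean_loadbalancer" "ingress" {
  name        = "ingress"
  region      = "fra1"
  vpc_uuid    = module.nemo.vpc_id
  droplet_tag = module.nemo.worker_tag

  # Illustrative: forward TCP 80 to the ingress controller's host port
  forwarding_rule {
    entry_protocol  = "tcp"
    entry_port      = 80
    target_protocol = "tcp"
    target_port     = 80
  }

  # Illustrative health check values
  healthcheck {
    protocol = "tcp"
    port     = 80
  }
}
```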

View File

@ -1,6 +1,6 @@
# Operating Systems
Typhoon supports [Fedora CoreOS](https://getfedora.org/coreos/), [Flatcar Linux](https://www.flatcar-linux.org/) and Container Linux (EOL in May 2020). These operating systems were chosen because they offer:
Typhoon supports [Fedora CoreOS](https://getfedora.org/coreos/) and [Flatcar Linux](https://www.flatcar-linux.org/). These operating systems were chosen because they offer:
* Minimalism and focus on clustered operation
* Automated and atomic operating system upgrades

View File

@ -1,6 +1,6 @@
# AWS
In this tutorial, we'll create a Kubernetes v1.18.2 cluster on AWS with Fedora CoreOS.
In this tutorial, we'll create a Kubernetes v1.18.6 cluster on AWS with Fedora CoreOS.
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets.
@ -49,7 +49,7 @@ Configure the AWS provider to use your access key credentials in a `providers.tf
```tf
provider "aws" {
version = "2.53.0"
version = "2.70.0"
region = "eu-central-1"
shared_credentials_file = "/home/user/.config/aws/credentials"
}
@ -70,7 +70,7 @@ Define a Kubernetes cluster using the module `aws/fedora-coreos/kubernetes`.
```tf
module "tempest" {
source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.18.2"
source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.18.6"
# AWS
cluster_name = "tempest"
@ -143,9 +143,9 @@ List nodes in the cluster.
$ export KUBECONFIG=/home/user/.kube/configs/tempest-config
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-10-0-3-155 Ready <none> 10m v1.18.2
ip-10-0-26-65 Ready <none> 10m v1.18.2
ip-10-0-41-21 Ready <none> 10m v1.18.2
ip-10-0-3-155 Ready <none> 10m v1.18.6
ip-10-0-26-65 Ready <none> 10m v1.18.6
ip-10-0-41-21 Ready <none> 10m v1.18.6
```
List the pods.
@ -208,7 +208,7 @@ Reference the DNS zone id with `aws_route53_zone.zone-for-clusters.zone_id`.
| worker_count | Number of workers | 1 | 3 |
| controller_type | EC2 instance type for controllers | "t3.small" | See below |
| worker_type | EC2 instance type for workers | "t3.small" | See below |
| os_image | AMI channel for Fedora CoreOS | not yet used | ? |
| os_stream | Fedora CoreOS stream for compute instances | "stable" | "testing", "next" |
| disk_size | Size of the EBS volume in GB | 40 | 100 |
| disk_type | Type of the EBS volume | "gp2" | standard, gp2, io1 |
| disk_iops | IOPS of the EBS volume | 0 (i.e. auto) | 400 |
@ -216,7 +216,7 @@ Reference the DNS zone id with `aws_route53_zone.zone-for-clusters.zone_id`.
| worker_price | Spot price in USD for worker instances or 0 to use on-demand instances | 0 | 0.10 |
| controller_snippets | Controller Fedora CoreOS Config snippets | [] | [examples](/advanced/customization/) |
| worker_snippets | Worker Fedora CoreOS Config snippets | [] | [examples](/advanced/customization/) |
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" |
| network_mtu | CNI interface MTU (calico only) | 1480 | 8981 |
| host_cidr | CIDR IPv4 range to assign to EC2 instances | "10.0.0.0/16" | "10.1.0.0/16" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
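As a sketch of the two rows above, a cluster definition can opt into the Cilium provider and a non-default Fedora CoreOS stream. All values here (names, zone, key) are illustrative, not from this tutorial:

```tf
module "tempest" {
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.18.6"

  # illustrative cluster settings
  cluster_name       = "tempest"
  dns_zone           = "aws.example.com"
  dns_zone_id        = "Z3PAABBCFAKE"
  ssh_authorized_key = "ssh-rsa AAAAB3NZ..."

  # options shown in the table above
  networking = "cilium"
  os_stream  = "testing"
}
```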

View File

@ -1,6 +1,6 @@
# Azure
In this tutorial, we'll create a Kubernetes v1.18.2 cluster on Azure with Fedora CoreOS.
In this tutorial, we'll create a Kubernetes v1.18.6 cluster on Azure with Fedora CoreOS.
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a resource group, virtual network, subnets, security groups, controller availability set, worker scale set, load balancer, and TLS assets.
@ -47,7 +47,7 @@ Configure the Azure provider in a `providers.tf` file.
```tf
provider "azurerm" {
version = "2.5.0"
version = "2.19.0"
}
provider "ct" {
@ -83,7 +83,7 @@ Define a Kubernetes cluster using the module `azure/fedora-coreos/kubernetes`.
```tf
module "ramius" {
source = "git::https://github.com/poseidon/typhoon//azure/fedora-coreos/kubernetes?ref=v1.18.2"
source = "git::https://github.com/poseidon/typhoon//azure/fedora-coreos/kubernetes?ref=v1.18.6"
# Azure
cluster_name = "ramius"
@ -158,9 +158,9 @@ List nodes in the cluster.
$ export KUBECONFIG=/home/user/.kube/configs/ramius-config
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ramius-controller-0 Ready <none> 24m v1.18.2
ramius-worker-000001 Ready <none> 25m v1.18.2
ramius-worker-000002 Ready <none> 24m v1.18.2
ramius-controller-0 Ready <none> 24m v1.18.6
ramius-worker-000001 Ready <none> 25m v1.18.6
ramius-worker-000002 Ready <none> 24m v1.18.6
```
List the pods.
@ -242,7 +242,7 @@ Reference the DNS zone with `azurerm_dns_zone.clusters.name` and its resource gr
| worker_priority | Set priority to Spot to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time | Regular | Spot |
| controller_snippets | Controller Fedora CoreOS Config snippets | [] | [example](/advanced/customization/#usage) |
| worker_snippets | Worker Fedora CoreOS Config snippets | [] | [example](/advanced/customization/#usage) |
| networking | Choice of networking provider | "calico" | "flannel" or "calico" |
| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" |
| host_cidr | CIDR IPv4 range to assign to instances | "10.0.0.0/16" | "10.0.0.0/20" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |

Some files were not shown because too many files have changed in this diff.