Compare commits

...

54 Commits

Author SHA1 Message Date
0127ee82c1 Update nginx-ingress from v0.19.0 to v0.20.0 2018-10-16 21:35:29 -07:00
a10d6977b8 Update Prometheus from v2.4.2 to v2.4.3
* https://github.com/prometheus/prometheus/releases/tag/v2.4.3
2018-10-16 21:29:41 -07:00
05fe923c14 Update Grafana from v5.3.0 to v5.3.1
* https://github.com/grafana/grafana/releases/tag/v5.3.1
2018-10-16 21:23:44 -07:00
d10620fb58 Add support for Flatcar Linux bare-metal cached_install
* Support bare-metal cached_install=true mode with Flatcar Linux
where assets are fetched from the Matchbox assets cache instead
of from the upstream Flatcar download server
* Skipped in original Flatcar support to keep it simple
https://github.com/poseidon/typhoon/pull/209
2018-10-16 21:15:24 -07:00
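For the cached_install mode above, a minimal usage sketch (illustrative only; `cached_install` is the variable named in the commit, and the cluster's other required variables are elided):

    module "bare-metal-cluster" {
      source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.12.1"

      # hypothetical: fetch Flatcar Linux images from the Matchbox assets
      # cache instead of the upstream Flatcar download server
      cached_install = "true"

      # ...plus the cluster's other required variables
    }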
9b6113a058 Update Kubernetes from v1.11.3 to v1.12.1
* Mount an empty dir for the controller-manager to work around
https://github.com/kubernetes/kubernetes/issues/68973
* Update coreos/pod-checkpointer to strip affinity from
checkpointed pod manifests. Kubernetes v1.12.0-rc.1 introduced
a default affinity that appeared on checkpointed manifests and
prevented scheduling; checkpointed pods should not have an
affinity since they are run directly by the Kubelet on the local node
* https://github.com/kubernetes-incubator/bootkube/issues/1001
* https://github.com/kubernetes/kubernetes/pull/68173
2018-10-16 20:28:13 -07:00
5eb4078d68 Add docker/default seccomp to control plane and addons
* Annotate pods, deployments, and daemonsets to start containers
with the Docker runtime's default seccomp profile
* Overrides Kubernetes default behavior which started containers
with seccomp=unconfined
* https://docs.docker.com/engine/security/seccomp/#pass-a-profile-for-a-container
2018-10-16 20:07:29 -07:00
8f0d2b5db4 Update Grafana from v5.2.4 to v5.3.0 2018-10-13 23:03:31 -07:00
2e89e161e9 Remove Azure admin_password (disabled) now that it's optional
* Requires terraform-provider-azurerm v1.16.0 or higher
https://github.com/terraform-providers/terraform-provider-azurerm/pull/1958
2018-10-13 22:40:58 -07:00
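The azurerm requirement above can be expressed as a Terraform provider version constraint (a sketch; adapt it to your own provider configuration):

    provider "azurerm" {
      # v1.16.0 or newer within the 1.x series
      version = "~> 1.16"
    }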
55bb4dfba6 Raise CoreDNS replica count to 2 or more
* Run at least two replicas of CoreDNS to better support
rolling updates (previously, kube-dns had a pod nanny)
* On multi-master clusters, set the CoreDNS replica count
to match the number of masters (e.g. a 3-master cluster
previously used replicas:1, now replicas:3)
* Add AntiAffinity preferred rule to favor distributing
CoreDNS pods across controller nodes
2018-10-13 20:31:29 -07:00
43fe78a2cc Raise scheduler/controller-manager replicas in multi-master
* Continue to ensure scheduler and controller-manager run
at least two replicas to support performing kubectl edits
on single-master clusters (no change)
* For multi-master clusters, set scheduler / controller-manager
replica count to the number of masters (e.g. a 3-master cluster
previously used replicas:2, now replicas:3)
2018-10-13 16:16:29 -07:00
5a283b6443 Update etcd from v3.3.9 to v3.3.10
* https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.3.md#v3310-2018-10-10
2018-10-13 13:14:37 -07:00
ff0c271d7b Fix Calico network policy docs link 2018-10-07 21:02:16 +02:00
89088b6a99 Fix broken README links 2018-10-07 20:56:13 +02:00
ee569e3a59 Change rooted links to absolute links for mkdocs v1.0
* mkdocs v1.0 now evaluates links beginning with / as
absolute links that are only prepended with the site url
2018-10-07 20:51:54 +02:00
daeaec0832 Fix requirements for docs builds 2018-10-07 19:26:55 +02:00
db36036c81 Require terraform-provider-digitalocean plugin ~> 1.0
* Require a terraform-provider-digitalocean plugin version of
1.0 or higher within the same major version (e.g. allow 1.1 but
not 2.0)
* Change requirement from ~> 0.1.2 (which allowed up to but not
including 1.0 release)
2018-10-02 17:09:19 +02:00
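A version constraint matching the requirement above (sketch only):

    provider "digitalocean" {
      # allow 1.x releases (e.g. 1.1), but not 2.0
      version = "~> 1.0"
    }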
7653e511be Update CoreDNS and Calico versions
* Update CoreDNS from 1.1.3 to 1.2.2
* Update Calico from v3.2.1 to v3.2.3
2018-10-02 16:07:48 +02:00
f8474d68c9 Fix typo in maintenance docs 2018-09-25 03:07:00 -07:00
032a24133b Update Prometheus from v2.3.2 to v2.4.2
* https://github.com/prometheus/prometheus/releases/tag/v2.4.0
* https://github.com/prometheus/prometheus/releases/tag/v2.4.1
* https://github.com/prometheus/prometheus/releases/tag/v2.4.2
2018-09-21 22:27:11 -07:00
f5f442b658 Improve the issue template prompt
* Clarify that reporting versions as latest is not
helpful
2018-09-15 10:05:07 -07:00
ad871dbfa9 Update Kubernetes from v1.11.2 to v1.11.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1113
2018-09-13 18:50:41 -07:00
5801be53ff Fix typo in ingress addon docs 2018-09-08 18:28:42 -07:00
c1b1669cf8 Fix Google Cloud preemption link in docs 2018-09-08 18:28:25 -07:00
dc03f7a4a9 Update nginx-ingress from 0.17.1 to 0.19.0
* If using --enable-ssl-passthrough or exposing TCP/UDP services,
be aware of https://github.com/kubernetes/ingress-nginx/pull/3038
* Workarounds until the fix merges are to stay on 0.17.1, use the
suggested development image, or revert to securityContext
`runAsNonRoot: false` for a while (less secure)
2018-09-08 17:57:01 -07:00
1b8234eb91 Update Grafana from v5.2.2 to v5.2.4
* https://github.com/grafana/grafana/releases/tag/v5.2.3
* https://github.com/grafana/grafana/releases/tag/v5.2.4
2018-09-08 15:41:20 -07:00
4ba090feb0 Update kube-state-metrics from v1.3.1 to v1.4.0 2018-08-29 09:37:50 -07:00
4882fe1053 Add docs for Azure Ingress and worker pools
* Azure worker pools must be in the same region as
the cluster itself, unfortunately
2018-08-27 23:30:56 -07:00
019009e9ee Add outputs for Azure ingress IPv4 and worker pools 2018-08-27 23:30:32 -07:00
991a5c6cee Add new tutorial docs and links 2018-08-27 23:30:32 -07:00
c60ec642bc Fix Azure delete-node script to lowercase hostnames
* Fix issue where worker nodes didn't delete themselves on
scale-down or deallocation (e.g. low priority instances).
Lowercase the hostname and delete the Kubernetes node
* Kubelet registers the lowercase hostname as the node name,
but Azure workers get hostname CLUSTER-worker-GENERATED where
the generated identifier may contain uppercase characters
2018-08-27 23:30:32 -07:00
38b4ff4700 Add module for Typhoon Azure with Container Linux 2018-08-27 23:30:32 -07:00
7eb09237f4 Update Calico from v3.1.3 to v3.2.1
* Add new bird and felix readiness checks
* Read MTU from ConfigMap veth_mtu
* Add RBAC read for serviceaccounts
* Remove invalid description from CRDs
2018-08-25 17:53:11 -07:00
e58b424882 Fix firewall to allow etcd client traffic between controllers
* Broaden internal-etcd firewall rule to allow etcd client
traffic (2379) from other controller nodes
* Previously, kube-apiservers were only able to connect to their
node's local etcd peer. While master node outages were tolerated,
reaching a healthy peer took longer than necessary in some cases
* Reduce time needed to bootstrap a cluster
2018-08-21 23:51:40 -07:00
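A simplified sketch of the broadened Google Cloud rule (resource and tag names are illustrative, not the module's exact definitions):

    resource "google_compute_firewall" "internal-etcd" {
      name    = "${var.cluster_name}-internal-etcd"
      network = "${google_compute_network.network.name}"

      allow {
        protocol = "tcp"
        ports    = ["2379", "2380"]
      }

      # etcd client (2379) and peer (2380) traffic between controller nodes
      source_tags = ["${var.cluster_name}-controller"]
      target_tags = ["${var.cluster_name}-controller"]
    }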
b8eeafe4f9 Template etcd_servers list to replace null_resource.repeat
* Remove the last usage of null_resource.repeat, which has
always been an eyesore for creating the etcd server list
* Originally, #224 switched to templating the etcd_servers
list for all clouds, but had to revert on GCP in #237
* https://github.com/poseidon/typhoon/pull/224
* https://github.com/poseidon/typhoon/pull/237
2018-08-21 22:46:24 -07:00
bdf1e6986e Fix terraform fmt 2018-08-21 21:59:55 -07:00
99e3721181 Mention Fedora Atomic system container artifacts
* Typhoon for Fedora Atomic uses system containers (container
images with added metadata) built directly from upstream
and published and served through Quay.io
* https://github.com/poseidon/system-containers
2018-08-21 21:52:52 -07:00
ea365b551a Fix docs mentions of ELBs to NLBs
* Typhoon AWS clusters have used an NLB rather than an ELB
since v1.10.5
* Add a few missing links in CHANGES
2018-08-21 21:40:06 -07:00
bbf2c13eef Remove AWS security rule allowing ICMP packets to nodes
* Deny ICMP packets for consistency across Typhoon clusters on
various clouds and because there isn't much need to allow them
2018-08-21 21:16:16 -07:00
da5d2c5321 Remove GCP firewall rule allowing Nginx Ingress health
* Nginx Ingress addon no longer uses hostNetwork, so Prometheus may
scrape port 10254 via the CNI network, rather than via the host
address
2018-08-21 21:06:03 -07:00
bceec9fdf5 Sort firewall / security rules and add comments
* No functional changes to network firewalls
2018-08-21 20:53:16 -07:00
49a9dc9b8b Fix typo in Prometheus alerting rules 2018-08-21 16:55:49 -07:00
bec5250e73 Remove unofficial bare-metal *_networkds variables
* Remove controller_networkds and worker_networkds variables. These
variables were always listed as experimental, unsupported, and excluded
from documentation in anticipation of Container Linux Config snippets
* Use Container Linux Config snippets on bare-metal instead. They
provide safer, more powerful, and more elegant host customization
2018-08-13 23:33:29 -07:00
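For the snippets mentioned above, a usage sketch (the variable name shown is hypothetical; see the bare-metal customization docs for the exact map format):

    module "bare-metal-cluster" {
      # ...

      # hypothetical map keyed by machine name; each value is a list of
      # Container Linux Config snippets read from local files
      clc_snippets = {
        "node1" = ["${file("./snippets/hello-worker.yaml")}"]
      }
    }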
dbdc3fc850 Add nginx-ingress addon manifests for bare-metal 2018-08-11 12:14:23 -07:00
e00f97c578 Update nginx-ingress from 0.16.2 to 0.17.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.17.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.17.0
2018-08-08 00:45:20 -07:00
f7ebdf475d Update Kubernetes from v1.11.1 to v1.11.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1112
2018-08-07 21:57:25 -07:00
716dfe4d17 Fix cluster links in customization docs 2018-07-29 12:46:59 -07:00
edc250d62a Fix Kubelet version for Fedora Atomic modules
* Release v1.11.1 erroneously left Fedora Atomic clusters using
the v1.11.0 Kubelet. The rest of the control plane ran v1.11.1
as expected
* Update Kubelet from v1.11.0 to v1.11.1 so Fedora Atomic matches
Container Linux
* Container Linux modules were not affected
2018-07-29 12:13:29 -07:00
db64ce3312 Update etcd from v3.3.8 to v3.3.9
* https://github.com/coreos/etcd/blob/master/CHANGELOG-3.3.md#v339-2018-07-24
2018-07-29 11:27:37 -07:00
7c327b8bf4 Update from bootkube v0.12.0 to v0.13.0 2018-07-29 11:20:17 -07:00
e6720cf738 Update heapster from v1.5.3 to v1.5.4
* https://github.com/kubernetes/heapster/releases/tag/v1.5.4
2018-07-29 11:19:57 -07:00
844f380b4e Update Grafana from v5.2.1 to v5.2.2
* https://github.com/grafana/grafana/releases/tag/v5.2.2
2018-07-29 11:12:56 -07:00
13beb13aab Add descriptions to bare-metal fedora-atomic variables 2018-07-29 11:07:48 -07:00
90c4a7483d Combine bare-metal CLC snippets maps into one map 2018-07-26 23:31:08 -07:00
4e7dfc115d Support Container Linux Config snippets on bare-metal 2018-07-25 23:14:54 -07:00
127 changed files with 3098 additions and 483 deletions

View File

@ -4,11 +4,11 @@
### Environment
* Platform: aws, bare-metal, google-cloud, digital-ocean
* Platform: aws, azure, bare-metal, google-cloud, digital-ocean
* OS: container-linux, fedora-atomic
* Terraform: `terraform version`
* Plugins: Provider plugin versions
* Ref: Git SHA (if applicable)
* Ref: Release version or Git SHA (reporting latest is **not** helpful)
* Terraform: `terraform version` (reporting latest is **not** helpful)
* Plugins: Provider plugin versions (reporting latest is **not** helpful)
### Problem

View File

@ -4,12 +4,98 @@ Notable changes between versions.
## Latest
* Kubernetes [v1.12.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.12.md#v1121)
* Update etcd from v3.3.9 to [v3.3.10](https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.3.md#v3310-2018-10-10)
* Update CoreDNS from 1.1.3 to 1.2.2
* Update Calico from v3.2.1 to v3.2.3
* On multi-controller clusters, raise scheduler and controller-manager replicas to equal the number of controller nodes ([#312](https://github.com/poseidon/typhoon/pull/312))
* Single-controller clusters continue to run 2 replicas as before
* Raise default CoreDNS replica count to the larger of 2 or the number of controller nodes ([#313](https://github.com/poseidon/typhoon/pull/313))
* Add AntiAffinity preferred rule to favor spreading CoreDNS pods
* Annotate Kubernetes control plane and addons to start containers with the Docker runtime's default seccomp profile ([#319](https://github.com/poseidon/typhoon/pull/319))
* Override Kubernetes default behavior that starts containers with seccomp=unconfined
#### Azure
* Remove admin_password field (disabled) since it is now optional
* Require `terraform-provider-azurerm` v1.16+ (action required)
#### Bare-Metal
* Add support for `cached_install` mode with Flatcar Linux ([#315](https://github.com/poseidon/typhoon/pull/315))
#### DigitalOcean
* Require `terraform-provider-digitalocean` v1.0+ (action required)
#### Addons
* Update nginx-ingress from v0.19.0 to v0.20.0
* Update Prometheus from v2.3.2 to v2.4.3
* Update Grafana from v5.2.4 to v5.3.1
## v1.11.3
* Kubernetes [v1.11.3](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1113)
* Introduce Typhoon for Azure as alpha ([#288](https://github.com/poseidon/typhoon/pull/288))
* Special thanks @justaugustus for an earlier variant
* Update Calico from v3.1.3 to v3.2.1 ([#278](https://github.com/poseidon/typhoon/pull/278))
#### AWS
* Remove firewall rule allowing ICMP packets to nodes ([#285](https://github.com/poseidon/typhoon/pull/285))
#### Bare-Metal
* Remove `controller_networkds` and `worker_networkds` variables. Use Container Linux Config snippets [#277](https://github.com/poseidon/typhoon/pull/277)
#### Google Cloud
* Fix firewall to allow etcd client port 2379 traffic between controller nodes ([#287](https://github.com/poseidon/typhoon/pull/287))
* kube-apiservers were only able to connect to their node's local etcd peer. While master node outages were tolerated, reaching a healthy peer took longer than necessary in some cases
* Reduce time needed to bootstrap the cluster
* Remove firewall rule allowing workers to access Nginx Ingress health check ([#284](https://github.com/poseidon/typhoon/pull/284))
* Nginx Ingress addon no longer uses hostNetwork, Prometheus scrapes via CNI network
#### Addons
* Update nginx-ingress from 0.17.1 to 0.19.0
* Update kube-state-metrics from v1.3.1 to v1.4.0
* Update Grafana from 5.2.2 to 5.2.4
## v1.11.2
* Kubernetes [v1.11.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1112)
* Update etcd from v3.3.8 to [v3.3.9](https://github.com/coreos/etcd/blob/master/CHANGELOG-3.3.md#v339-2018-07-24)
* Use kubernetes-incubator/bootkube v0.13.0
* Fix Fedora Atomic modules' Kubelet version ([#270](https://github.com/poseidon/typhoon/issues/270))
#### Bare-Metal
* Introduce [Container Linux Config snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) on bare-metal
* Validate and additively merge custom Container Linux Configs during terraform plan
* Define files, systemd units, dropins, networkd configs, mounts, users, and more
* [Require](https://typhoon.psdn.io/cl/bare-metal/#terraform-setup) `terraform-provider-ct` plugin v0.2.1 (**action required!**)
#### Addons
* Update nginx-ingress from 0.16.2 to 0.17.1
* Add nginx-ingress manifests for bare-metal
* Update Grafana from 5.2.1 to 5.2.2
* Update heapster from v1.5.3 to v1.5.4
## v1.11.1
* Kubernetes [v1.11.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1111)
#### Addons
* Update Prometheus from v2.3.1 to v2.3.2
#### Errata
* Fedora Atomic modules shipped with Kubelet v1.11.0, instead of v1.11.1. Fixed in [#270](https://github.com/poseidon/typhoon/issues/270).
## v1.11.0
* Kubernetes [v1.11.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1110)

View File

@ -11,10 +11,10 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.11.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.12.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/) and [preemption](https://typhoon.psdn.io/google-cloud/#preemption) (varies by platform)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/) and [preemption](https://typhoon.psdn.io/cl/google-cloud/#preemption) (varies by platform)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
## Modules
@ -25,6 +25,7 @@ Typhoon provides a Terraform Module for each supported operating system and plat
|---------------|------------------|------------------|--------|
| AWS | Container Linux | [aws/container-linux/kubernetes](aws/container-linux/kubernetes) | stable |
| AWS | Fedora Atomic | [aws/fedora-atomic/kubernetes](aws/fedora-atomic/kubernetes) | alpha |
| Azure | Container Linux | [azure/container-linux/kubernetes](cl/azure.md) | alpha |
| Bare-Metal | Container Linux | [bare-metal/container-linux/kubernetes](bare-metal/container-linux/kubernetes) | stable |
| Bare-Metal | Fedora Atomic | [bare-metal/fedora-atomic/kubernetes](bare-metal/fedora-atomic/kubernetes) | alpha |
| Digital Ocean | Container Linux | [digital-ocean/container-linux/kubernetes](digital-ocean/container-linux/kubernetes) | beta |
@ -38,7 +39,7 @@ The AWS and bare-metal `container-linux` modules allow picking Red Hat Container
* [Docs](https://typhoon.psdn.io)
* Architecture [concepts](https://typhoon.psdn.io/architecture/concepts/) and [operating systems](https://typhoon.psdn.io/architecture/operating-systems/)
* Tutorials for [AWS](https://typhoon.psdn.io/cl/aws/), [Bare-Metal](https://typhoon.psdn.io/cl/bare-metal/), [Digital Ocean](https://typhoon.psdn.io/cl/digital-ocean/), and [Google-Cloud](https://typhoon.psdn.io/cl/google-cloud/)
* Tutorials for [AWS](docs/cl/aws.md), [Azure](docs/cl/azure.md), [Bare-Metal](docs/cl/bare-metal.md), [Digital Ocean](docs/cl/digital-ocean.md), and [Google-Cloud](docs/cl/google-cloud.md)
## Usage
@ -46,7 +47,7 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
```tf
module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.11.1"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.12.1"
providers = {
google = "google.default"
@ -71,15 +72,14 @@ module "google-cloud-yavin" {
}
```
Fetch modules, plan the changes to be made, and apply the changes.
Initialize modules, plan the changes to be made, and apply the changes.
```sh
$ terraform init
$ terraform get --update
$ terraform plan
Plan: 37 to add, 0 to change, 0 to destroy.
Plan: 64 to add, 0 to change, 0 to destroy.
$ terraform apply
Apply complete! Resources: 37 added, 0 changed, 0 destroyed.
Apply complete! Resources: 64 added, 0 changed, 0 destroyed.
```
In 4-8 minutes (varies by platform), the cluster will be ready. This Google Cloud example creates a `yavin.example.com` DNS record to resolve to a network load balancer across controller nodes.
@ -88,9 +88,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.11.1
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.11.1
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.11.1
yavin-controller-0.c.example-com.internal Ready 6m v1.12.1
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.12.1
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.12.1
```
List the pods.

View File

@ -15,6 +15,8 @@ spec:
metadata:
labels:
app: container-linux-update-agent
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
containers:
- name: update-agent

View File

@ -12,6 +12,8 @@ spec:
metadata:
labels:
app: container-linux-update-operator
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
containers:
- name: update-operator

View File

@ -18,10 +18,12 @@ spec:
labels:
name: grafana
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
containers:
- name: grafana
image: grafana/grafana:5.2.1
image: grafana/grafana:5.3.1
env:
- name: GF_SERVER_HTTP_PORT
value: "8080"

View File

@ -20,7 +20,7 @@ spec:
serviceAccountName: heapster
containers:
- name: heapster
image: k8s.gcr.io/heapster-amd64:v1.5.3
image: k8s.gcr.io/heapster-amd64:v1.5.4
command:
- /heapster
- --source=kubernetes.summary_api:''

View File

@ -14,6 +14,8 @@ spec:
labels:
name: default-backend
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
containers:
- name: default-backend

View File

@ -17,12 +17,14 @@ spec:
labels:
name: nginx-ingress-controller
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
nodeSelector:
node-role.kubernetes.io/node: ""
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.16.2
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.20.0
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-backend

View File

@ -0,0 +1,6 @@
apiVersion: v1
kind: Namespace
metadata:
name: ingress
labels:
name: ingress

View File

@ -0,0 +1,42 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: default-backend
namespace: ingress
spec:
replicas: 1
selector:
matchLabels:
name: default-backend
phase: prod
template:
metadata:
labels:
name: default-backend
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
containers:
- name: default-backend
# Any image is permissible as long as:
# 1. It serves a 404 page at /
# 2. It serves 200 on a /healthz endpoint
image: k8s.gcr.io/defaultbackend:1.4
ports:
- containerPort: 8080
resources:
limits:
cpu: 10m
memory: 20Mi
requests:
cpu: 10m
memory: 20Mi
livenessProbe:
httpGet:
path: /healthz
port: 8080
scheme: HTTP
initialDelaySeconds: 30
timeoutSeconds: 5
terminationGracePeriodSeconds: 60

View File

@ -0,0 +1,15 @@
apiVersion: v1
kind: Service
metadata:
name: default-backend
namespace: ingress
spec:
type: ClusterIP
selector:
name: default-backend
phase: prod
ports:
- name: http
protocol: TCP
port: 80
targetPort: 8080

View File

@ -0,0 +1,79 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-ingress-controller
namespace: ingress
spec:
replicas: 2
strategy:
rollingUpdate:
maxUnavailable: 1
selector:
matchLabels:
name: nginx-ingress-controller
phase: prod
template:
metadata:
labels:
name: nginx-ingress-controller
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
nodeSelector:
node-role.kubernetes.io/node: ""
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.20.0
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-backend
- --ingress-class=public
# use downward API
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
ports:
- name: http
containerPort: 80
hostPort: 80
- name: https
containerPort: 443
hostPort: 443
- name: health
containerPort: 10254
hostPort: 10254
livenessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 10254
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
readinessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 10254
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
securityContext:
capabilities:
add:
- NET_BIND_SERVICE
drop:
- ALL
runAsUser: 33 # www-data
restartPolicy: Always
terminationGracePeriodSeconds: 60

View File

@ -0,0 +1,12 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: ingress
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: ingress
subjects:
- kind: ServiceAccount
namespace: ingress
name: default

View File

@ -0,0 +1,51 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: ingress
rules:
- apiGroups:
- ""
resources:
- configmaps
- endpoints
- nodes
- pods
- secrets
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- apiGroups:
- ""
resources:
- services
verbs:
- get
- list
- watch
- apiGroups:
- "extensions"
resources:
- ingresses
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- events
verbs:
- create
- patch
- apiGroups:
- "extensions"
resources:
- ingresses/status
verbs:
- update

View File

@ -0,0 +1,13 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: ingress
namespace: ingress
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: ingress
subjects:
- kind: ServiceAccount
namespace: ingress
name: default

View File

@ -0,0 +1,41 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: ingress
namespace: ingress
rules:
- apiGroups:
- ""
resources:
- configmaps
- pods
- secrets
verbs:
- get
- apiGroups:
- ""
resources:
- configmaps
resourceNames:
# Defaults to "<election-id>-<ingress-class>"
# Here: "<ingress-controller-leader>-<nginx>"
# This has to be adapted if you change either parameter
# when launching the nginx-ingress-controller.
- "ingress-controller-leader-public"
verbs:
- get
- update
- apiGroups:
- ""
resources:
- configmaps
verbs:
- create
- apiGroups:
- ""
resources:
- endpoints
verbs:
- get
- create
- update

View File

@ -0,0 +1,22 @@
apiVersion: v1
kind: Service
metadata:
name: nginx-ingress-controller
namespace: ingress
annotations:
prometheus.io/scrape: 'true'
prometheus.io/port: '10254'
spec:
type: ClusterIP
selector:
name: nginx-ingress-controller
phase: prod
ports:
- name: http
protocol: TCP
port: 80
targetPort: 80
- name: https
protocol: TCP
port: 443
targetPort: 443

View File

@ -0,0 +1,6 @@
apiVersion: v1
kind: Namespace
metadata:
name: ingress
labels:
name: ingress

View File

@ -0,0 +1,42 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: default-backend
namespace: ingress
spec:
replicas: 1
selector:
matchLabels:
name: default-backend
phase: prod
template:
metadata:
labels:
name: default-backend
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
containers:
- name: default-backend
# Any image is permissible as long as:
# 1. It serves a 404 page at /
# 2. It serves 200 on a /healthz endpoint
image: k8s.gcr.io/defaultbackend:1.4
ports:
- containerPort: 8080
resources:
limits:
cpu: 10m
memory: 20Mi
requests:
cpu: 10m
memory: 20Mi
livenessProbe:
httpGet:
path: /healthz
port: 8080
scheme: HTTP
initialDelaySeconds: 30
timeoutSeconds: 5
terminationGracePeriodSeconds: 60

View File

@ -0,0 +1,15 @@
apiVersion: v1
kind: Service
metadata:
name: default-backend
namespace: ingress
spec:
type: ClusterIP
selector:
name: default-backend
phase: prod
ports:
- name: http
protocol: TCP
port: 80
targetPort: 8080

View File

@ -0,0 +1,75 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: ingress-controller-public
namespace: ingress
spec:
replicas: 2
strategy:
rollingUpdate:
maxUnavailable: 1
selector:
matchLabels:
name: ingress-controller-public
phase: prod
template:
metadata:
labels:
name: ingress-controller-public
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.20.0
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-backend
- --ingress-class=public
# use downward API
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
ports:
- name: http
containerPort: 80
- name: https
containerPort: 443
- name: health
containerPort: 10254
livenessProbe:
httpGet:
path: /healthz
port: 10254
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
failureThreshold: 3
timeoutSeconds: 1
readinessProbe:
httpGet:
path: /healthz
port: 10254
scheme: HTTP
periodSeconds: 10
successThreshold: 1
failureThreshold: 3
timeoutSeconds: 1
securityContext:
capabilities:
add:
- NET_BIND_SERVICE
drop:
- ALL
runAsUser: 33 # www-data
restartPolicy: Always
terminationGracePeriodSeconds: 60

View File

@ -0,0 +1,12 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: ingress
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: ingress
subjects:
- kind: ServiceAccount
namespace: ingress
name: default

View File

@ -0,0 +1,51 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: ingress
rules:
- apiGroups:
- ""
resources:
- configmaps
- endpoints
- nodes
- pods
- secrets
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- apiGroups:
- ""
resources:
- services
verbs:
- get
- list
- watch
- apiGroups:
- "extensions"
resources:
- ingresses
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- events
verbs:
- create
- patch
- apiGroups:
- "extensions"
resources:
- ingresses/status
verbs:
- update

View File

@ -0,0 +1,13 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: ingress
namespace: ingress
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: ingress
subjects:
- kind: ServiceAccount
namespace: ingress
name: default

View File

@ -0,0 +1,41 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: ingress
namespace: ingress
rules:
- apiGroups:
- ""
resources:
- configmaps
- pods
- secrets
verbs:
- get
- apiGroups:
- ""
resources:
- configmaps
resourceNames:
# Defaults to "<election-id>-<ingress-class>"
# Here: "<ingress-controller-leader>-<nginx>"
# This has to be adapted if you change either parameter
# when launching the nginx-ingress-controller.
- "ingress-controller-leader-public"
verbs:
- get
- update
- apiGroups:
- ""
resources:
- configmaps
verbs:
- create
- apiGroups:
- ""
resources:
- endpoints
verbs:
- get
- create
- update

View File

@ -0,0 +1,23 @@
apiVersion: v1
kind: Service
metadata:
name: ingress-controller-public
namespace: ingress
annotations:
prometheus.io/scrape: 'true'
prometheus.io/port: '10254'
spec:
type: ClusterIP
clusterIP: 10.3.0.12
selector:
name: ingress-controller-public
phase: prod
ports:
- name: http
protocol: TCP
port: 80
targetPort: 80
- name: https
protocol: TCP
port: 443
targetPort: 443

View File

@ -17,12 +17,14 @@ spec:
labels:
name: nginx-ingress-controller
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
nodeSelector:
node-role.kubernetes.io/node: ""
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.16.2
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.20.0
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-backend

View File

@ -14,6 +14,8 @@ spec:
labels:
name: default-backend
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
containers:
- name: default-backend

View File

@ -14,6 +14,8 @@ spec:
labels:
name: default-backend
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
containers:
- name: default-backend

View File

@ -17,12 +17,14 @@ spec:
labels:
name: nginx-ingress-controller
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
nodeSelector:
node-role.kubernetes.io/node: ""
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.16.2
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.20.0
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-backend

View File

@ -14,11 +14,13 @@ spec:
labels:
name: prometheus
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
serviceAccountName: prometheus
containers:
- name: prometheus
image: quay.io/prometheus/prometheus:v2.3.2
image: quay.io/prometheus/prometheus:v2.4.3
args:
- --web.listen-address=0.0.0.0:9090
- --config.file=/etc/prometheus/prometheus.yaml

View File

@ -18,11 +18,13 @@ spec:
labels:
name: kube-state-metrics
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
serviceAccountName: kube-state-metrics
containers:
- name: kube-state-metrics
image: quay.io/coreos/kube-state-metrics:v1.3.1
image: quay.io/coreos/kube-state-metrics:v1.4.0
ports:
- name: metrics
containerPort: 8080

View File

@ -17,6 +17,8 @@ spec:
labels:
name: node-exporter
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
serviceAccountName: node-exporter
securityContext:

View File

@ -299,7 +299,7 @@ data:
labels:
severity: warning
annotations:
description: Pod {{$labels.namespaces}}/{{$labels.pod}} is was restarted {{$value}}
description: Pod {{$labels.namespaces}}/{{$labels.pod}} restarted {{$value}}
times within the last hour
summary: Pod is restarting frequently
kubelet.rules.yaml: |

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.11.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.12.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/)

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=9e6fc7e697a9d6a0201768def7adfb3f168c56d7"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=81f19507faabf411db9c760d55f3d03f7d78f4c9"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]

View File

@ -7,7 +7,7 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.8"
Environment="ETCD_IMAGE_TAG=v3.3.10"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
@ -122,7 +122,7 @@ storage:
contents:
inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.11.1
KUBELET_IMAGE_TAG=v1.12.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -143,7 +143,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.12.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.13.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \

View File

@ -5,7 +5,7 @@ resource "aws_route53_record" "apiserver" {
name = "${format("%s.%s.", var.cluster_name, var.dns_zone)}"
type = "A"
# AWS recommends their special "alias" records for ELBs
# AWS recommends their special "alias" records for NLBs
alias {
name = "${aws_lb.nlb.dns_name}"
zone_id = "${aws_lb.nlb.zone_id}"

View File

@ -11,16 +11,6 @@ resource "aws_security_group" "controller" {
tags = "${map("Name", "${var.cluster_name}-controller")}"
}
resource "aws_security_group_rule" "controller-icmp" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "icmp"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-ssh" {
security_group_id = "${aws_security_group.controller.id}"
@ -31,16 +21,6 @@ resource "aws_security_group_rule" "controller-ssh" {
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-apiserver" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 6443
to_port = 6443
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-etcd" {
security_group_id = "${aws_security_group.controller.id}"
@ -51,6 +31,7 @@ resource "aws_security_group_rule" "controller-etcd" {
self = true
}
# Allow Prometheus to scrape etcd metrics
resource "aws_security_group_rule" "controller-etcd-metrics" {
security_group_id = "${aws_security_group.controller.id}"
@ -61,6 +42,16 @@ resource "aws_security_group_rule" "controller-etcd-metrics" {
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-apiserver" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 6443
to_port = 6443
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-flannel" {
security_group_id = "${aws_security_group.controller.id}"
@ -81,6 +72,7 @@ resource "aws_security_group_rule" "controller-flannel-self" {
self = true
}
# Allow Prometheus to scrape node-exporter daemonset
resource "aws_security_group_rule" "controller-node-exporter" {
security_group_id = "${aws_security_group.controller.id}"
@ -91,6 +83,7 @@ resource "aws_security_group_rule" "controller-node-exporter" {
source_security_group_id = "${aws_security_group.worker.id}"
}
# Allow apiserver to access kubelets for exec, log, port-forward
resource "aws_security_group_rule" "controller-kubelet" {
security_group_id = "${aws_security_group.controller.id}"
@ -111,6 +104,7 @@ resource "aws_security_group_rule" "controller-kubelet-self" {
self = true
}
# Allow heapster / metrics-server to scrape kubelet read-only
resource "aws_security_group_rule" "controller-kubelet-read" {
security_group_id = "${aws_security_group.controller.id}"
@ -213,16 +207,6 @@ resource "aws_security_group" "worker" {
tags = "${map("Name", "${var.cluster_name}-worker")}"
}
resource "aws_security_group_rule" "worker-icmp" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "icmp"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "worker-ssh" {
security_group_id = "${aws_security_group.worker.id}"
@ -273,6 +257,7 @@ resource "aws_security_group_rule" "worker-flannel-self" {
self = true
}
# Allow Prometheus to scrape node-exporter daemonset
resource "aws_security_group_rule" "worker-node-exporter" {
security_group_id = "${aws_security_group.worker.id}"
@ -293,6 +278,7 @@ resource "aws_security_group_rule" "ingress-health" {
cidr_blocks = ["0.0.0.0/0"]
}
# Allow apiserver to access kubelets for exec, log, port-forward
resource "aws_security_group_rule" "worker-kubelet" {
security_group_id = "${aws_security_group.worker.id}"
@ -303,6 +289,7 @@ resource "aws_security_group_rule" "worker-kubelet" {
source_security_group_id = "${aws_security_group.controller.id}"
}
# Allow Prometheus to scrape kubelet metrics
resource "aws_security_group_rule" "worker-kubelet-self" {
security_group_id = "${aws_security_group.worker.id}"
@ -313,6 +300,7 @@ resource "aws_security_group_rule" "worker-kubelet-self" {
self = true
}
# Allow heapster / metrics-server to scrape kubelet read-only
resource "aws_security_group_rule" "worker-kubelet-read" {
security_group_id = "${aws_security_group.worker.id}"

View File

@ -92,7 +92,7 @@ storage:
contents:
inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.11.1
KUBELET_IMAGE_TAG=v1.12.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -110,7 +110,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://k8s.gcr.io/hyperkube:v1.11.1 \
docker://k8s.gcr.io/hyperkube:v1.12.1 \
--net=host \
--dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.11.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.12.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/)

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=9e6fc7e697a9d6a0201768def7adfb3f168c56d7"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=81f19507faabf411db9c760d55f3d03f7d78f4c9"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]

View File

@ -92,9 +92,9 @@ bootcmd:
runcmd:
- [systemctl, daemon-reload]
- [systemctl, restart, NetworkManager]
- "atomic install --system --name=etcd quay.io/poseidon/etcd:v3.3.8"
- "atomic install --system --name=kubelet quay.io/poseidon/kubelet:v1.11.0"
- "atomic install --system --name=bootkube quay.io/poseidon/bootkube:v0.12.0"
- "atomic install --system --name=etcd quay.io/poseidon/etcd:v3.3.10"
- "atomic install --system --name=kubelet quay.io/poseidon/kubelet:v1.12.1"
- "atomic install --system --name=bootkube quay.io/poseidon/bootkube:v0.13.0"
- [systemctl, start, --no-block, etcd.service]
- [systemctl, enable, cloud-metadata.service]
- [systemctl, start, --no-block, kubelet.service]

View File

@ -5,7 +5,7 @@ resource "aws_route53_record" "apiserver" {
name = "${format("%s.%s.", var.cluster_name, var.dns_zone)}"
type = "A"
# AWS recommends their special "alias" records for ELBs
# AWS recommends their special "alias" records for NLBs
alias {
name = "${aws_lb.nlb.dns_name}"
zone_id = "${aws_lb.nlb.zone_id}"

View File

@ -11,16 +11,6 @@ resource "aws_security_group" "controller" {
tags = "${map("Name", "${var.cluster_name}-controller")}"
}
resource "aws_security_group_rule" "controller-icmp" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "icmp"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-ssh" {
security_group_id = "${aws_security_group.controller.id}"
@ -31,16 +21,6 @@ resource "aws_security_group_rule" "controller-ssh" {
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-apiserver" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 6443
to_port = 6443
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-etcd" {
security_group_id = "${aws_security_group.controller.id}"
@ -51,6 +31,7 @@ resource "aws_security_group_rule" "controller-etcd" {
self = true
}
# Allow Prometheus to scrape etcd metrics
resource "aws_security_group_rule" "controller-etcd-metrics" {
security_group_id = "${aws_security_group.controller.id}"
@ -61,6 +42,16 @@ resource "aws_security_group_rule" "controller-etcd-metrics" {
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-apiserver" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 6443
to_port = 6443
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-flannel" {
security_group_id = "${aws_security_group.controller.id}"
@ -81,6 +72,7 @@ resource "aws_security_group_rule" "controller-flannel-self" {
self = true
}
# Allow Prometheus to scrape node-exporter daemonset
resource "aws_security_group_rule" "controller-node-exporter" {
security_group_id = "${aws_security_group.controller.id}"
@ -91,6 +83,7 @@ resource "aws_security_group_rule" "controller-node-exporter" {
source_security_group_id = "${aws_security_group.worker.id}"
}
# Allow apiserver to access kubelets for exec, log, port-forward
resource "aws_security_group_rule" "controller-kubelet" {
security_group_id = "${aws_security_group.controller.id}"
@ -111,6 +104,7 @@ resource "aws_security_group_rule" "controller-kubelet-self" {
self = true
}
# Allow heapster / metrics-server to scrape kubelet read-only
resource "aws_security_group_rule" "controller-kubelet-read" {
security_group_id = "${aws_security_group.controller.id}"
@ -213,16 +207,6 @@ resource "aws_security_group" "worker" {
tags = "${map("Name", "${var.cluster_name}-worker")}"
}
resource "aws_security_group_rule" "worker-icmp" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "icmp"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "worker-ssh" {
security_group_id = "${aws_security_group.worker.id}"
@ -273,6 +257,7 @@ resource "aws_security_group_rule" "worker-flannel-self" {
self = true
}
# Allow Prometheus to scrape node-exporter daemonset
resource "aws_security_group_rule" "worker-node-exporter" {
security_group_id = "${aws_security_group.worker.id}"
@ -293,6 +278,7 @@ resource "aws_security_group_rule" "ingress-health" {
cidr_blocks = ["0.0.0.0/0"]
}
# Allow apiserver to access kubelets for exec, log, port-forward
resource "aws_security_group_rule" "worker-kubelet" {
security_group_id = "${aws_security_group.worker.id}"
@ -303,6 +289,7 @@ resource "aws_security_group_rule" "worker-kubelet" {
source_security_group_id = "${aws_security_group.controller.id}"
}
# Allow Prometheus to scrape kubelet metrics
resource "aws_security_group_rule" "worker-kubelet-self" {
security_group_id = "${aws_security_group.worker.id}"
@ -313,6 +300,7 @@ resource "aws_security_group_rule" "worker-kubelet-self" {
self = true
}
# Allow heapster / metrics-server to scrape kubelet read-only
resource "aws_security_group_rule" "worker-kubelet-read" {
security_group_id = "${aws_security_group.worker.id}"

View File

@ -69,7 +69,7 @@ runcmd:
- [systemctl, daemon-reload]
- [systemctl, restart, NetworkManager]
- [systemctl, enable, cloud-metadata.service]
- "atomic install --system --name=kubelet quay.io/poseidon/kubelet:v1.11.0"
- "atomic install --system --name=kubelet quay.io/poseidon/kubelet:v1.12.1"
- [systemctl, start, --no-block, kubelet.service]
users:
- default

View File

@ -0,0 +1,23 @@
The MIT License (MIT)
Copyright (c) 2017 Typhoon Authors
Copyright (c) 2017 Dalton Hubble
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

View File

@ -0,0 +1,22 @@
# Typhoon <img align="right" src="https://storage.googleapis.com/poseidon/typhoon-logo.png">
Typhoon is a minimal and free Kubernetes distribution.
* Minimal, stable base Kubernetes distribution
* Declarative infrastructure and configuration
* Free (freedom and cost) and privacy-respecting
* Practical for labs, datacenters, and clouds
Typhoon distributes upstream Kubernetes, architectural conventions, and cluster addons, much like a GNU/Linux distribution provides the Linux kernel and userspace components.
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.12.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
## Docs
Please see the [official docs](https://typhoon.psdn.io) and the Azure [tutorial](https://typhoon.psdn.io/cl/azure/).

View File

@ -0,0 +1,13 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=81f19507faabf411db9c760d55f3d03f7d78f4c9"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
etcd_servers = ["${formatlist("%s.%s", azurerm_dns_a_record.etcds.*.name, var.dns_zone)}"]
asset_dir = "${var.asset_dir}"
networking = "flannel"
pod_cidr = "${var.pod_cidr}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}

View File

@ -0,0 +1,163 @@
---
systemd:
units:
- name: etcd-member.service
enable: true
dropins:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.10"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
Environment="ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379"
Environment="ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380"
Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
Environment="ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}"
Environment="ETCD_STRICT_RECONFIG_CHECK=true"
Environment="ETCD_SSL_DIR=/etc/ssl/etcd"
Environment="ETCD_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/server-ca.crt"
Environment="ETCD_CERT_FILE=/etc/ssl/certs/etcd/server.crt"
Environment="ETCD_KEY_FILE=/etc/ssl/certs/etcd/server.key"
Environment="ETCD_CLIENT_CERT_AUTH=true"
Environment="ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/peer-ca.crt"
Environment="ETCD_PEER_CERT_FILE=/etc/ssl/certs/etcd/peer.crt"
Environment="ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer.key"
Environment="ETCD_PEER_CLIENT_CERT_AUTH=true"
- name: docker.service
enable: true
- name: locksmithd.service
mask: true
- name: wait-for-dns.service
enable: true
contents: |
[Unit]
Description=Wait for DNS entries
Wants=systemd-resolved.service
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
RequiredBy=etcd-member.service
- name: kubelet.service
enable: true
contents: |
[Unit]
Description=Kubelet via Hyperkube
Wants=rpc-statd.service
[Service]
EnvironmentFile=/etc/kubernetes/kubelet.env
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/cache/kubelet-pod.uuid \
--volume=resolv,kind=host,source=/etc/resolv.conf \
--mount volume=resolv,target=/etc/resolv.conf \
--volume var-lib-cni,kind=host,source=/var/lib/cni \
--mount volume=var-lib-cni,target=/var/lib/cni \
--volume var-lib-calico,kind=host,source=/var/lib/calico \
--mount volume=var-lib-calico,target=/var/lib/calico \
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=opt-cni-bin,target=/opt/cni/bin \
--volume var-log,kind=host,source=/var/log \
--mount volume=var-log,target=/var/log \
--insecure-options=image"
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/calico
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${k8s_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/master \
--node-labels=node-role.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
- name: bootkube.service
contents: |
[Unit]
Description=Bootstrap a Kubernetes cluster
ConditionPathExists=!/opt/bootkube/init_bootkube.done
[Service]
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/opt/bootkube
ExecStart=/opt/bootkube/bootkube-start
ExecStartPost=/bin/touch /opt/bootkube/init_bootkube.done
[Install]
WantedBy=multi-user.target
storage:
files:
- path: /etc/kubernetes/kubeconfig
filesystem: root
mode: 0644
contents:
inline: |
${kubeconfig}
- path: /etc/kubernetes/kubelet.env
filesystem: root
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.12.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
inline: |
fs.inotify.max_user_watches=16184
- path: /opt/bootkube/bootkube-start
filesystem: root
mode: 0544
user:
id: 500
group:
id: 500
contents:
inline: |
#!/bin/bash
# Wrapper for bootkube start
set -e
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.13.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \
--volume assets,kind=host,source=$${BOOTKUBE_ASSETS} \
--mount volume=assets,target=/assets \
--volume bootstrap,kind=host,source=/etc/kubernetes \
--mount volume=bootstrap,target=/etc/kubernetes \
$${RKT_OPTS} \
$${BOOTKUBE_ACI}:$${BOOTKUBE_VERSION} \
--net=host \
--dns=host \
--exec=/bootkube -- start --asset-dir=/assets "$@"
passwd:
users:
- name: core
ssh_authorized_keys:
- "${ssh_authorized_key}"

View File

@ -0,0 +1,163 @@
# Discrete DNS records for each controller's private IPv4 for etcd usage
resource "azurerm_dns_a_record" "etcds" {
count = "${var.controller_count}"
resource_group_name = "${var.dns_zone_group}"
# DNS Zone name where record should be created
zone_name = "${var.dns_zone}"
# DNS record
name = "${format("%s-etcd%d", var.cluster_name, count.index)}"
ttl = 300
# private IPv4 address for etcd
records = ["${element(azurerm_network_interface.controllers.*.private_ip_address, count.index)}"]
}
locals {
# Channel for a Container Linux derivative
# coreos-stable -> Container Linux Stable
channel = "${element(split("-", var.os_image), 1)}"
}
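# For example, the default os_image "coreos-stable" splits into ["coreos", "stable"],
# so local.channel = "stable", which is used below as the CoreOS image SKU.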
# Controller availability set to spread controllers
resource "azurerm_availability_set" "controllers" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}-controllers"
location = "${var.region}"
platform_fault_domain_count = 2
platform_update_domain_count = 4
managed = true
}
# Controller instances
resource "azurerm_virtual_machine" "controllers" {
count = "${var.controller_count}"
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}-controller-${count.index}"
location = "${var.region}"
availability_set_id = "${azurerm_availability_set.controllers.id}"
vm_size = "${var.controller_type}"
# boot
storage_image_reference {
publisher = "CoreOS"
offer = "CoreOS"
sku = "${local.channel}"
version = "latest"
}
# storage
storage_os_disk {
name = "${var.cluster_name}-controller-${count.index}"
create_option = "FromImage"
caching = "ReadWrite"
disk_size_gb = "${var.disk_size}"
os_type = "Linux"
managed_disk_type = "Premium_LRS"
}
# network
network_interface_ids = ["${element(azurerm_network_interface.controllers.*.id, count.index)}"]
os_profile {
computer_name = "${var.cluster_name}-controller-${count.index}"
admin_username = "core"
custom_data = "${element(data.ct_config.controller-ignitions.*.rendered, count.index)}"
}
# Azure mandates setting an ssh_key, even though Ignition custom_data handles it too
os_profile_linux_config {
disable_password_authentication = true
ssh_keys {
path = "/home/core/.ssh/authorized_keys"
key_data = "${var.ssh_authorized_key}"
}
}
# lifecycle
delete_os_disk_on_termination = true
delete_data_disks_on_termination = true
lifecycle {
ignore_changes = [
"storage_os_disk",
]
}
}
# Controller NICs with public and private IPv4
resource "azurerm_network_interface" "controllers" {
count = "${var.controller_count}"
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}-controller-${count.index}"
location = "${azurerm_resource_group.cluster.location}"
network_security_group_id = "${azurerm_network_security_group.controller.id}"
ip_configuration {
name = "ip0"
subnet_id = "${azurerm_subnet.controller.id}"
private_ip_address_allocation = "dynamic"
# public IPv4
public_ip_address_id = "${element(azurerm_public_ip.controllers.*.id, count.index)}"
# backend address pool to which the NIC should be added
load_balancer_backend_address_pools_ids = ["${azurerm_lb_backend_address_pool.controller.id}"]
}
}
# Controller public IPv4 addresses
resource "azurerm_public_ip" "controllers" {
count = "${var.controller_count}"
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}-controller-${count.index}"
location = "${azurerm_resource_group.cluster.location}"
sku = "Standard"
public_ip_address_allocation = "static"
}
# Controller Ignition configs
data "ct_config" "controller-ignitions" {
count = "${var.controller_count}"
content = "${element(data.template_file.controller-configs.*.rendered, count.index)}"
pretty_print = false
snippets = ["${var.controller_clc_snippets}"]
}
# Controller Container Linux configs
data "template_file" "controller-configs" {
count = "${var.controller_count}"
template = "${file("${path.module}/cl/controller.yaml.tmpl")}"
vars = {
# Cannot use cyclic dependencies on controllers or their DNS records
etcd_name = "etcd${count.index}"
etcd_domain = "${var.cluster_name}-etcd${count.index}.${var.dns_zone}"
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
etcd_initial_cluster = "${join(",", data.template_file.etcds.*.rendered)}"
kubeconfig = "${indent(10, module.bootkube.kubeconfig)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}
}
data "template_file" "etcds" {
count = "${var.controller_count}"
template = "etcd$${index}=https://$${cluster_name}-etcd$${index}.$${dns_zone}:2380"
vars {
index = "${count.index}"
cluster_name = "${var.cluster_name}"
dns_zone = "${var.dns_zone}"
}
}
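# For example, a hypothetical 3-controller cluster with cluster_name = "ramius" and
# dns_zone = "azure.example.com" would render etcd_initial_cluster as:
#   etcd0=https://ramius-etcd0.azure.example.com:2380,etcd1=https://ramius-etcd1.azure.example.com:2380,etcd2=https://ramius-etcd2.azure.example.com:2380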

View File

@ -0,0 +1,144 @@
# DNS record for the apiserver load balancer
resource "azurerm_dns_a_record" "apiserver" {
resource_group_name = "${var.dns_zone_group}"
# DNS Zone name where record should be created
zone_name = "${var.dns_zone}"
# DNS record
name = "${var.cluster_name}"
ttl = 300
# IPv4 address of apiserver load balancer
records = ["${azurerm_public_ip.apiserver-ipv4.ip_address}"]
}
# Static IPv4 address for the apiserver frontend
resource "azurerm_public_ip" "apiserver-ipv4" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}-apiserver-ipv4"
location = "${var.region}"
sku = "Standard"
public_ip_address_allocation = "static"
}
# Static IPv4 address for the ingress frontend
resource "azurerm_public_ip" "ingress-ipv4" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}-ingress-ipv4"
location = "${var.region}"
sku = "Standard"
public_ip_address_allocation = "static"
}
# Network Load Balancer for apiservers and ingress
resource "azurerm_lb" "cluster" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}"
location = "${var.region}"
sku = "Standard"
frontend_ip_configuration {
name = "apiserver"
public_ip_address_id = "${azurerm_public_ip.apiserver-ipv4.id}"
}
frontend_ip_configuration {
name = "ingress"
public_ip_address_id = "${azurerm_public_ip.ingress-ipv4.id}"
}
}
resource "azurerm_lb_rule" "apiserver" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "apiserver"
loadbalancer_id = "${azurerm_lb.cluster.id}"
frontend_ip_configuration_name = "apiserver"
protocol = "Tcp"
frontend_port = 6443
backend_port = 6443
backend_address_pool_id = "${azurerm_lb_backend_address_pool.controller.id}"
probe_id = "${azurerm_lb_probe.apiserver.id}"
}
resource "azurerm_lb_rule" "ingress-http" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "ingress-http"
loadbalancer_id = "${azurerm_lb.cluster.id}"
frontend_ip_configuration_name = "ingress"
protocol = "Tcp"
frontend_port = 80
backend_port = 80
backend_address_pool_id = "${azurerm_lb_backend_address_pool.worker.id}"
probe_id = "${azurerm_lb_probe.ingress.id}"
}
resource "azurerm_lb_rule" "ingress-https" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "ingress-https"
loadbalancer_id = "${azurerm_lb.cluster.id}"
frontend_ip_configuration_name = "ingress"
protocol = "Tcp"
frontend_port = 443
backend_port = 443
backend_address_pool_id = "${azurerm_lb_backend_address_pool.worker.id}"
probe_id = "${azurerm_lb_probe.ingress.id}"
}
# Address pool of controllers
resource "azurerm_lb_backend_address_pool" "controller" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "controller"
loadbalancer_id = "${azurerm_lb.cluster.id}"
}
# Address pool of workers
resource "azurerm_lb_backend_address_pool" "worker" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "worker"
loadbalancer_id = "${azurerm_lb.cluster.id}"
}
# Health checks / probes
# TCP health check for apiserver
resource "azurerm_lb_probe" "apiserver" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "apiserver"
loadbalancer_id = "${azurerm_lb.cluster.id}"
protocol = "Tcp"
port = 6443
# unhealthy threshold
number_of_probes = 3
interval_in_seconds = 5
}
# HTTP health check for ingress
resource "azurerm_lb_probe" "ingress" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "ingress"
loadbalancer_id = "${azurerm_lb.cluster.id}"
protocol = "Http"
port = 10254
request_path = "/healthz"
# unhealthy threshold
number_of_probes = 3
interval_in_seconds = 5
}

View File

@ -0,0 +1,33 @@
# Organize cluster into a resource group
resource "azurerm_resource_group" "cluster" {
name = "${var.cluster_name}"
location = "${var.region}"
}
resource "azurerm_virtual_network" "network" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}"
location = "${azurerm_resource_group.cluster.location}"
address_space = ["${var.host_cidr}"]
}
# Subnets - separate subnets for controller and workers because Azure
# network security groups are based on IPv4 CIDR rather than instance
# tags like GCP or security group membership like AWS
resource "azurerm_subnet" "controller" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "controller"
virtual_network_name = "${azurerm_virtual_network.network.name}"
address_prefix = "${cidrsubnet(var.host_cidr, 1, 0)}"
}
resource "azurerm_subnet" "worker" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "worker"
virtual_network_name = "${azurerm_virtual_network.network.name}"
address_prefix = "${cidrsubnet(var.host_cidr, 1, 1)}"
}
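# For example, with the default host_cidr = "10.0.0.0/16" the virtual network is split in half:
#   cidrsubnet("10.0.0.0/16", 1, 0) = "10.0.0.0/17"    (controller subnet)
#   cidrsubnet("10.0.0.0/16", 1, 1) = "10.0.128.0/17"  (worker subnet)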

View File

@ -0,0 +1,32 @@
# Outputs for Kubernetes Ingress
output "ingress_static_ipv4" {
value = "${azurerm_public_ip.ingress-ipv4.ip_address}"
description = "IPv4 address of the load balancer for distributing traffic to Ingress controllers"
}
# Outputs for worker pools
output "region" {
value = "${azurerm_resource_group.cluster.location}"
}
output "resource_group_name" {
value = "${azurerm_resource_group.cluster.name}"
}
output "subnet_id" {
value = "${azurerm_subnet.worker.id}"
}
output "security_group_id" {
value = "${azurerm_network_security_group.worker.id}"
}
output "backend_address_pool_id" {
value = "${azurerm_lb_backend_address_pool.worker.id}"
}
output "kubeconfig" {
value = "${module.bootkube.kubeconfig}"
}

View File

@ -0,0 +1,26 @@
# Terraform version and plugin versions
terraform {
required_version = ">= 0.11.0"
}
provider "azurerm" {
version = "~> 1.16"
}
provider "local" {
version = "~> 1.0"
}
provider "null" {
version = "~> 1.0"
}
provider "template" {
version = "~> 1.0"
}
provider "tls" {
version = "~> 1.0"
}

View File

@ -0,0 +1,319 @@
# Controller security group
resource "azurerm_network_security_group" "controller" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}-controller"
location = "${azurerm_resource_group.cluster.location}"
}
resource "azurerm_network_security_rule" "controller-ssh" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-ssh"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "2000"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "22"
source_address_prefix = "*"
destination_address_prefix = "${azurerm_subnet.controller.address_prefix}"
}
resource "azurerm_network_security_rule" "controller-etcd" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-etcd"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "2005"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "2379-2380"
source_address_prefix = "${azurerm_subnet.controller.address_prefix}"
destination_address_prefix = "${azurerm_subnet.controller.address_prefix}"
}
# Allow Prometheus to scrape etcd metrics
resource "azurerm_network_security_rule" "controller-etcd-metrics" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-etcd-metrics"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "2010"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "2381"
source_address_prefix = "${azurerm_subnet.worker.address_prefix}"
destination_address_prefix = "${azurerm_subnet.controller.address_prefix}"
}
resource "azurerm_network_security_rule" "controller-apiserver" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-apiserver"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "2015"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "6443"
source_address_prefix = "*"
destination_address_prefix = "${azurerm_subnet.controller.address_prefix}"
}
resource "azurerm_network_security_rule" "controller-flannel" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-flannel"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "2020"
access = "Allow"
direction = "Inbound"
protocol = "Udp"
source_port_range = "*"
destination_port_range = "8472"
source_address_prefixes = ["${azurerm_subnet.controller.address_prefix}", "${azurerm_subnet.worker.address_prefix}"]
destination_address_prefix = "${azurerm_subnet.controller.address_prefix}"
}
# Allow Prometheus to scrape node-exporter daemonset
resource "azurerm_network_security_rule" "controller-node-exporter" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-node-exporter"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "2025"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "9100"
source_address_prefix = "${azurerm_subnet.worker.address_prefix}"
destination_address_prefix = "${azurerm_subnet.controller.address_prefix}"
}
# Allow apiserver to access kubelets for exec, log, port-forward
resource "azurerm_network_security_rule" "controller-kubelet" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-kubelet"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "2030"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10250"
# allow Prometheus to scrape kubelet metrics too
source_address_prefixes = ["${azurerm_subnet.controller.address_prefix}", "${azurerm_subnet.worker.address_prefix}"]
destination_address_prefix = "${azurerm_subnet.controller.address_prefix}"
}
# Allow heapster / metrics-server to scrape kubelet read-only
resource "azurerm_network_security_rule" "controller-kubelet-read" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-kubelet-read"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "2035"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10255"
source_address_prefix = "${azurerm_subnet.worker.address_prefix}"
destination_address_prefix = "${azurerm_subnet.controller.address_prefix}"
}
# Override Azure AllowVNetInBound and AllowAzureLoadBalancerInBound
# https://docs.microsoft.com/en-us/azure/virtual-network/security-overview#default-security-rules
resource "azurerm_network_security_rule" "controller-allow-loadblancer" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-loadbalancer"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "3000"
access = "Allow"
direction = "Inbound"
protocol = "*"
source_port_range = "*"
destination_port_range = "*"
source_address_prefix = "AzureLoadBalancer"
destination_address_prefix = "*"
}
resource "azurerm_network_security_rule" "controller-deny-all" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "deny-all"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "3005"
access = "Deny"
direction = "Inbound"
protocol = "*"
source_port_range = "*"
destination_port_range = "*"
source_address_prefix = "*"
destination_address_prefix = "*"
}
# Worker security group
resource "azurerm_network_security_group" "worker" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}-worker"
location = "${azurerm_resource_group.cluster.location}"
}
resource "azurerm_network_security_rule" "worker-ssh" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-ssh"
network_security_group_name = "${azurerm_network_security_group.worker.name}"
priority = "2000"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "22"
source_address_prefix = "${azurerm_subnet.controller.address_prefix}"
destination_address_prefix = "${azurerm_subnet.worker.address_prefix}"
}
resource "azurerm_network_security_rule" "worker-http" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-http"
network_security_group_name = "${azurerm_network_security_group.worker.name}"
priority = "2005"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "80"
source_address_prefix = "*"
destination_address_prefix = "${azurerm_subnet.worker.address_prefix}"
}
resource "azurerm_network_security_rule" "worker-https" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-https"
network_security_group_name = "${azurerm_network_security_group.worker.name}"
priority = "2010"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "443"
source_address_prefix = "*"
destination_address_prefix = "${azurerm_subnet.worker.address_prefix}"
}
resource "azurerm_network_security_rule" "worker-flannel" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-flannel"
network_security_group_name = "${azurerm_network_security_group.worker.name}"
priority = "2015"
access = "Allow"
direction = "Inbound"
protocol = "Udp"
source_port_range = "*"
destination_port_range = "8472"
source_address_prefixes = ["${azurerm_subnet.controller.address_prefix}", "${azurerm_subnet.worker.address_prefix}"]
destination_address_prefix = "${azurerm_subnet.worker.address_prefix}"
}
# Allow Prometheus to scrape node-exporter daemonset
resource "azurerm_network_security_rule" "worker-node-exporter" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-node-exporter"
network_security_group_name = "${azurerm_network_security_group.worker.name}"
priority = "2020"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "9100"
source_address_prefix = "${azurerm_subnet.worker.address_prefix}"
destination_address_prefix = "${azurerm_subnet.worker.address_prefix}"
}
# Allow apiserver to access kubelets for exec, log, port-forward
resource "azurerm_network_security_rule" "worker-kubelet" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-kubelet"
network_security_group_name = "${azurerm_network_security_group.worker.name}"
priority = "2025"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10250"
# allow Prometheus to scrape kubelet metrics too
source_address_prefixes = ["${azurerm_subnet.controller.address_prefix}", "${azurerm_subnet.worker.address_prefix}"]
destination_address_prefix = "${azurerm_subnet.worker.address_prefix}"
}
# Allow heapster / metrics-server to scrape kubelet read-only
resource "azurerm_network_security_rule" "worker-kubelet-read" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-kubelet-read"
network_security_group_name = "${azurerm_network_security_group.worker.name}"
priority = "2030"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10255"
source_address_prefix = "${azurerm_subnet.worker.address_prefix}"
destination_address_prefix = "${azurerm_subnet.worker.address_prefix}"
}
# Override Azure AllowVNetInBound and AllowAzureLoadBalancerInBound
# https://docs.microsoft.com/en-us/azure/virtual-network/security-overview#default-security-rules
resource "azurerm_network_security_rule" "worker-allow-loadblancer" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-loadbalancer"
network_security_group_name = "${azurerm_network_security_group.worker.name}"
priority = "3000"
access = "Allow"
direction = "Inbound"
protocol = "*"
source_port_range = "*"
destination_port_range = "*"
source_address_prefix = "AzureLoadBalancer"
destination_address_prefix = "*"
}
resource "azurerm_network_security_rule" "worker-deny-all" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "deny-all"
network_security_group_name = "${azurerm_network_security_group.worker.name}"
priority = "3005"
access = "Deny"
direction = "Inbound"
protocol = "*"
source_port_range = "*"
destination_port_range = "*"
source_address_prefix = "*"
destination_address_prefix = "*"
}

View File

@ -0,0 +1,95 @@
# Secure copy etcd TLS assets to controllers.
resource "null_resource" "copy-controller-secrets" {
count = "${var.controller_count}"
depends_on = [
"azurerm_virtual_machine.controllers",
]
connection {
type = "ssh"
host = "${element(azurerm_public_ip.controllers.*.ip_address, count.index)}"
user = "core"
timeout = "15m"
}
provisioner "file" {
content = "${module.bootkube.etcd_ca_cert}"
destination = "$HOME/etcd-client-ca.crt"
}
provisioner "file" {
content = "${module.bootkube.etcd_client_cert}"
destination = "$HOME/etcd-client.crt"
}
provisioner "file" {
content = "${module.bootkube.etcd_client_key}"
destination = "$HOME/etcd-client.key"
}
provisioner "file" {
content = "${module.bootkube.etcd_server_cert}"
destination = "$HOME/etcd-server.crt"
}
provisioner "file" {
content = "${module.bootkube.etcd_server_key}"
destination = "$HOME/etcd-server.key"
}
provisioner "file" {
content = "${module.bootkube.etcd_peer_cert}"
destination = "$HOME/etcd-peer.crt"
}
provisioner "file" {
content = "${module.bootkube.etcd_peer_key}"
destination = "$HOME/etcd-peer.key"
}
provisioner "remote-exec" {
inline = [
"sudo mkdir -p /etc/ssl/etcd/etcd",
"sudo mv etcd-client* /etc/ssl/etcd/",
"sudo cp /etc/ssl/etcd/etcd-client-ca.crt /etc/ssl/etcd/etcd/server-ca.crt",
"sudo mv etcd-server.crt /etc/ssl/etcd/etcd/server.crt",
"sudo mv etcd-server.key /etc/ssl/etcd/etcd/server.key",
"sudo cp /etc/ssl/etcd/etcd-client-ca.crt /etc/ssl/etcd/etcd/peer-ca.crt",
"sudo mv etcd-peer.crt /etc/ssl/etcd/etcd/peer.crt",
"sudo mv etcd-peer.key /etc/ssl/etcd/etcd/peer.key",
"sudo chown -R etcd:etcd /etc/ssl/etcd",
"sudo chmod -R 500 /etc/ssl/etcd",
]
}
}
# Secure copy bootkube assets to ONE controller and start bootkube to perform
# one-time self-hosted cluster bootstrapping.
resource "null_resource" "bootkube-start" {
depends_on = [
"module.bootkube",
"module.workers",
"azurerm_dns_a_record.apiserver",
"null_resource.copy-controller-secrets",
]
connection {
type = "ssh"
host = "${element(azurerm_public_ip.controllers.*.ip_address, 0)}"
user = "core"
timeout = "15m"
}
provisioner "file" {
source = "${var.asset_dir}"
destination = "$HOME/assets"
}
provisioner "remote-exec" {
inline = [
"sudo mv $HOME/assets /opt/bootkube",
"sudo systemctl start bootkube",
]
}
}

View File

@ -0,0 +1,117 @@
variable "cluster_name" {
type = "string"
description = "Unique cluster name (prepended to dns_zone)"
}
# Azure
variable "region" {
type = "string"
description = "Azure Region (e.g. centralus , see `az account list-locations --output table`)"
}
variable "dns_zone" {
type = "string"
description = "Azure DNS Zone (e.g. azure.example.com)"
}
variable "dns_zone_group" {
type = "string"
description = "Resource group where the Azure DNS Zone resides (e.g. global)"
}
# instances
variable "controller_count" {
type = "string"
default = "1"
description = "Number of controllers (i.e. masters)"
}
variable "worker_count" {
type = "string"
default = "1"
description = "Number of workers"
}
variable "controller_type" {
type = "string"
default = "Standard_DS1_v2"
description = "Machine type for controllers (see `az vm list-skus --location centralus`)"
}
variable "worker_type" {
type = "string"
default = "Standard_F1"
description = "Machine type for workers (see `az vm list-skus --location centralus`)"
}
variable "os_image" {
type = "string"
default = "coreos-stable"
description = "Channel for a Container Linux derivative (coreos-stable, coreos-beta, coreos-alpha)"
}
variable "disk_size" {
type = "string"
default = "40"
description = "Size of the disk in GB"
}
variable "worker_priority" {
type = "string"
default = "Regular"
description = "Set worker priority to Low to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time."
}
variable "controller_clc_snippets" {
type = "list"
description = "Controller Container Linux Config snippets"
default = []
}
variable "worker_clc_snippets" {
type = "list"
description = "Worker Container Linux Config snippets"
default = []
}
# configuration
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for user 'core'"
}
variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
type = "string"
}
variable "host_cidr" {
description = "CIDR IPv4 range to assign to instances"
type = "string"
default = "10.0.0.0/16"
}
variable "pod_cidr" {
description = "CIDR IPv4 range to assign Kubernetes pods"
type = "string"
default = "10.2.0.0/16"
}
variable "service_cidr" {
description = <<EOD
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver and the 10th IP will be reserved for coredns.
EOD
type = "string"
default = "10.3.0.0/16"
}
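# For example, with the default service_cidr = "10.3.0.0/16", the kube-apiserver service IP
# is cidrhost("10.3.0.0/16", 1) = "10.3.0.1" and the CoreDNS service IP passed to kubelets
# is cidrhost("10.3.0.0/16", 10) = "10.3.0.10".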
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = "string"
default = "cluster.local"
}

View File

@ -0,0 +1,23 @@
module "workers" {
source = "workers"
name = "${var.cluster_name}"
# Azure
resource_group_name = "${azurerm_resource_group.cluster.name}"
region = "${azurerm_resource_group.cluster.location}"
subnet_id = "${azurerm_subnet.worker.id}"
security_group_id = "${azurerm_network_security_group.worker.id}"
backend_address_pool_id = "${azurerm_lb_backend_address_pool.worker.id}"
count = "${var.worker_count}"
vm_type = "${var.worker_type}"
os_image = "${var.os_image}"
priority = "${var.worker_priority}"
# configuration
kubeconfig = "${module.bootkube.kubeconfig}"
ssh_authorized_key = "${var.ssh_authorized_key}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
clc_snippets = "${var.worker_clc_snippets}"
}

View File

@ -0,0 +1,121 @@
---
systemd:
units:
- name: docker.service
enable: true
- name: locksmithd.service
mask: true
- name: wait-for-dns.service
enable: true
contents: |
[Unit]
Description=Wait for DNS entries
Wants=systemd-resolved.service
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
- name: kubelet.service
enable: true
contents: |
[Unit]
Description=Kubelet via Hyperkube
Wants=rpc-statd.service
[Service]
EnvironmentFile=/etc/kubernetes/kubelet.env
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/cache/kubelet-pod.uuid \
--volume=resolv,kind=host,source=/etc/resolv.conf \
--mount volume=resolv,target=/etc/resolv.conf \
--volume var-lib-cni,kind=host,source=/var/lib/cni \
--mount volume=var-lib-cni,target=/var/lib/cni \
--volume var-lib-calico,kind=host,source=/var/lib/calico \
--mount volume=var-lib-calico,target=/var/lib/calico \
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=opt-cni-bin,target=/opt/cni/bin \
--volume var-log,kind=host,source=/var/log \
--mount volume=var-log,target=/var/log \
--insecure-options=image"
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/calico
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${k8s_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/node \
--pod-manifest-path=/etc/kubernetes/manifests \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
- name: delete-node.service
enable: true
contents: |
[Unit]
Description=Waiting to delete Kubernetes node on shutdown
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/etc/kubernetes/delete-node
[Install]
WantedBy=multi-user.target
storage:
files:
- path: /etc/kubernetes/kubeconfig
filesystem: root
mode: 0644
contents:
inline: |
${kubeconfig}
- path: /etc/kubernetes/kubelet.env
filesystem: root
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.12.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
inline: |
fs.inotify.max_user_watches=16184
- path: /etc/kubernetes/delete-node
filesystem: root
mode: 0744
contents:
inline: |
#!/bin/bash
set -e
exec /usr/bin/rkt run \
--trust-keys-from-https \
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://k8s.gcr.io/hyperkube:v1.12.1 \
--net=host \
--dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname | tr '[:upper:]' '[:lower:]')
passwd:
users:
- name: core
ssh_authorized_keys:
- "${ssh_authorized_key}"

View File

@ -0,0 +1 @@

View File

@ -0,0 +1,91 @@
variable "name" {
type = "string"
description = "Unique name for the worker pool"
}
# Azure
variable "region" {
type = "string"
description = "Must be set to the Azure Region of cluster"
}
variable "resource_group_name" {
type = "string"
description = "Must be set to the resource group name of cluster"
}
variable "subnet_id" {
type = "string"
description = "Must be set to the `worker_subnet_id` output by cluster"
}
variable "security_group_id" {
type = "string"
description = "Must be set to the `worker_security_group_id` output by cluster"
}
variable "backend_address_pool_id" {
type = "string"
description = "Must be set to the `worker_backend_address_pool_id` output by cluster"
}
# instances
variable "count" {
type = "string"
default = "1"
description = "Number of instances"
}
variable "vm_type" {
type = "string"
default = "Standard_F1"
description = "Machine type for instances (see `az vm list-skus --location centralus`)"
}
variable "os_image" {
type = "string"
default = "coreos-stable"
description = "Channel for a Container Linux derivative (coreos-stable, coreos-beta, coreos-alpha)"
}
variable "priority" {
type = "string"
default = "Regular"
description = "Set priority to Low to use reduced cost surplus capacity, with the tradeoff that instances can be evicted at any time."
}
variable "clc_snippets" {
type = "list"
description = "Container Linux Config snippets"
default = []
}
# configuration
variable "kubeconfig" {
type = "string"
description = "Must be set to `kubeconfig` output by cluster"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for user 'core'"
}
variable "service_cidr" {
description = <<EOD
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver and the 10th IP will be reserved for coredns.
EOD
type = "string"
default = "10.3.0.0/16"
}
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = "string"
default = "cluster.local"
}

View File

@ -0,0 +1,112 @@
locals {
# Channel for a Container Linux derivative
# coreos-stable -> Container Linux Stable
channel = "${element(split("-", var.os_image), 1)}"
}
# Workers scale set
resource "azurerm_virtual_machine_scale_set" "workers" {
resource_group_name = "${var.resource_group_name}"
name = "${var.name}-workers"
location = "${var.region}"
single_placement_group = false
sku {
name = "${var.vm_type}"
tier = "standard"
capacity = "${var.count}"
}
# boot
storage_profile_image_reference {
publisher = "CoreOS"
offer = "CoreOS"
sku = "${local.channel}"
version = "latest"
}
# storage
storage_profile_os_disk {
create_option = "FromImage"
caching = "ReadWrite"
os_type = "linux"
managed_disk_type = "Standard_LRS"
}
os_profile {
computer_name_prefix = "${var.name}-worker-"
admin_username = "core"
custom_data = "${element(data.ct_config.worker-ignitions.*.rendered, count.index)}"
}
# Azure mandates setting an ssh_key, even though Ignition custom_data handles it too
os_profile_linux_config {
disable_password_authentication = true
ssh_keys {
path = "/home/core/.ssh/authorized_keys"
key_data = "${var.ssh_authorized_key}"
}
}
# network
network_profile {
name = "nic0"
primary = true
network_security_group_id = "${var.security_group_id}"
ip_configuration {
name = "ip0"
subnet_id = "${var.subnet_id}"
# backend address pool to which the NIC should be added
load_balancer_backend_address_pool_ids = ["${var.backend_address_pool_id}"]
}
}
# lifecycle
priority = "${var.priority}"
upgrade_policy_mode = "Manual"
}
# Scale up or down to maintain desired number, tolerating deallocations.
resource "azurerm_autoscale_setting" "workers" {
resource_group_name = "${var.resource_group_name}"
name = "${var.name}-maintain-desired"
location = "${var.region}"
# autoscale
enabled = true
target_resource_id = "${azurerm_virtual_machine_scale_set.workers.id}"
profile {
name = "default"
capacity {
minimum = "${var.count}"
default = "${var.count}"
maximum = "${var.count}"
}
}
}
# Worker Ignition configs
data "ct_config" "worker-ignitions" {
content = "${data.template_file.worker-configs.rendered}"
pretty_print = false
snippets = ["${var.clc_snippets}"]
}
# Worker Container Linux configs
data "template_file" "worker-configs" {
template = "${file("${path.module}/cl/worker.yaml.tmpl")}"
vars = {
kubeconfig = "${indent(10, var.kubeconfig)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}
}

azure/ignore/.gitkeep Normal file
View File

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.11.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.12.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=9e6fc7e697a9d6a0201768def7adfb3f168c56d7"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=81f19507faabf411db9c760d55f3d03f7d78f4c9"
cluster_name = "${var.cluster_name}"
api_servers = ["${var.k8s_domain_name}"]

View File

@ -7,7 +7,7 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.8"
Environment="ETCD_IMAGE_TAG=v3.3.10"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${domain_name}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${domain_name}:2380"
@ -123,7 +123,7 @@ storage:
contents:
inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.11.1
KUBELET_IMAGE_TAG=v1.12.1
- path: /etc/hostname
filesystem: root
mode: 0644
@ -150,7 +150,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.12.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.13.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \
@ -163,8 +163,6 @@ storage:
--net=host \
--dns=host \
--exec=/bootkube -- start --asset-dir=/assets "$@"
networkd:
${networkd_content}
passwd:
users:
- name: core

View File

@ -84,7 +84,7 @@ storage:
contents:
inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.11.1
KUBELET_IMAGE_TAG=v1.12.1
- path: /etc/hostname
filesystem: root
mode: 0644
@ -96,8 +96,6 @@ storage:
contents:
inline: |
fs.inotify.max_user_watches=16184
networkd:
${networkd_content}
passwd:
users:
- name: core

View File

@ -3,7 +3,7 @@ resource "matchbox_group" "install" {
name = "${format("install-%s", element(concat(var.controller_names, var.worker_names), count.index))}"
profile = "${local.flavor == "flatcar" ? element(matchbox_profile.flatcar-install.*.name, count.index) : var.cached_install == "true" ? element(matchbox_profile.cached-container-linux-install.*.name, count.index) : element(matchbox_profile.container-linux-install.*.name, count.index)}"
profile = "${local.flavor == "flatcar" ? var.cached_install == "true" ? element(matchbox_profile.cached-flatcar-linux-install.*.name, count.index) : element(matchbox_profile.flatcar-install.*.name, count.index) : var.cached_install == "true" ? element(matchbox_profile.cached-container-linux-install.*.name, count.index) : element(matchbox_profile.container-linux-install.*.name, count.index)}"
selector {
mac = "${element(concat(var.controller_macs, var.worker_macs), count.index)}"

View File

@ -49,7 +49,7 @@ data "template_file" "container-linux-install-configs" {
}
// Container Linux Install profile (from matchbox /assets cache)
// Note: Admin must have downloaded os_version into matchbox assets.
// Note: Admin must have downloaded os_version into matchbox assets/coreos.
resource "matchbox_profile" "cached-container-linux-install" {
count = "${length(var.controller_names) + length(var.worker_names)}"
name = "${format("%s-cached-container-linux-install-%s", var.cluster_name, element(concat(var.controller_names, var.worker_names), count.index))}"
@ -87,7 +87,7 @@ data "template_file" "cached-container-linux-install-configs" {
ssh_authorized_key = "${var.ssh_authorized_key}"
# profile uses -b baseurl to install from matchbox cache
baseurl_flag = "-b ${var.matchbox_http_endpoint}/assets/coreos"
baseurl_flag = "-b ${var.matchbox_http_endpoint}/assets/${local.flavor}"
}
}
@ -114,11 +114,44 @@ resource "matchbox_profile" "flatcar-install" {
container_linux_config = "${element(data.template_file.container-linux-install-configs.*.rendered, count.index)}"
}
// Flatcar Linux Install profile (from matchbox /assets cache)
// Note: Admin must have downloaded os_version into matchbox assets/flatcar.
resource "matchbox_profile" "cached-flatcar-linux-install" {
count = "${length(var.controller_names) + length(var.worker_names)}"
name = "${format("%s-cached-flatcar-linux-install-%s", var.cluster_name, element(concat(var.controller_names, var.worker_names), count.index))}"
kernel = "/assets/flatcar/${var.os_version}/flatcar_production_pxe.vmlinuz"
initrd = [
"/assets/flatcar/${var.os_version}/flatcar_production_pxe_image.cpio.gz",
]
args = [
"initrd=flatcar_production_pxe_image.cpio.gz",
"flatcar.config.url=${var.matchbox_http_endpoint}/ignition?uuid=$${uuid}&mac=$${mac:hexhyp}",
"flatcar.first_boot=yes",
"console=tty0",
"console=ttyS0",
"${var.kernel_args}",
]
container_linux_config = "${element(data.template_file.cached-container-linux-install-configs.*.rendered, count.index)}"
}
// Kubernetes Controller profiles
resource "matchbox_profile" "controllers" {
count = "${length(var.controller_names)}"
name = "${format("%s-controller-%s", var.cluster_name, element(var.controller_names, count.index))}"
container_linux_config = "${element(data.template_file.controller-configs.*.rendered, count.index)}"
count = "${length(var.controller_names)}"
name = "${format("%s-controller-%s", var.cluster_name, element(var.controller_names, count.index))}"
raw_ignition = "${element(data.ct_config.controller-ignitions.*.rendered, count.index)}"
}
data "ct_config" "controller-ignitions" {
count = "${length(var.controller_names)}"
content = "${element(data.template_file.controller-configs.*.rendered, count.index)}"
pretty_print = false
# Must use direct lookup. Cannot use lookup(map, key) since it only works for flat maps
snippets = ["${local.clc_map[element(var.controller_names, count.index)]}"]
}
data "template_file" "controller-configs" {
@ -133,17 +166,23 @@ data "template_file" "controller-configs" {
k8s_dns_service_ip = "${module.bootkube.kube_dns_service_ip}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
ssh_authorized_key = "${var.ssh_authorized_key}"
# Terraform evaluates both sides regardless and element cannot be used on 0 length lists
networkd_content = "${length(var.controller_networkds) == 0 ? "" : element(concat(var.controller_networkds, list("")), count.index)}"
}
}
// Kubernetes Worker profiles
resource "matchbox_profile" "workers" {
count = "${length(var.worker_names)}"
name = "${format("%s-worker-%s", var.cluster_name, element(var.worker_names, count.index))}"
container_linux_config = "${element(data.template_file.worker-configs.*.rendered, count.index)}"
count = "${length(var.worker_names)}"
name = "${format("%s-worker-%s", var.cluster_name, element(var.worker_names, count.index))}"
raw_ignition = "${element(data.ct_config.worker-ignitions.*.rendered, count.index)}"
}
data "ct_config" "worker-ignitions" {
count = "${length(var.worker_names)}"
content = "${element(data.template_file.worker-configs.*.rendered, count.index)}"
pretty_print = false
# Must use direct lookup. Cannot use lookup(map, key) since it only works for flat maps
snippets = ["${local.clc_map[element(var.worker_names, count.index)]}"]
}
data "template_file" "worker-configs" {
@ -156,8 +195,21 @@ data "template_file" "worker-configs" {
k8s_dns_service_ip = "${module.bootkube.kube_dns_service_ip}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
ssh_authorized_key = "${var.ssh_authorized_key}"
# Terraform evaluates both sides regardless and element cannot be used on 0 length lists
networkd_content = "${length(var.worker_networkds) == 0 ? "" : element(concat(var.worker_networkds, list("")), count.index)}"
}
}
locals {
# Hack to workaround https://github.com/hashicorp/terraform/issues/17251
# Default Container Linux config snippets map every node name to list("\n") so
# all lookups succeed
clc_defaults = "${zipmap(concat(var.controller_names, var.worker_names), chunklist(data.template_file.clc-default-snippets.*.rendered, 1))}"
# Union of the default and user specific snippets, later overrides prior.
clc_map = "${merge(local.clc_defaults, var.clc_snippets)}"
}
// Horrible hack to generate a Terraform list of node count length
data "template_file" "clc-default-snippets" {
count = "${length(var.controller_names) + length(var.worker_names)}"
template = "\n"
}
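# For example, with controller_names = ["node1"] and worker_names = ["node2", "node3"],
# clc_defaults is { node1 = ["\n"], node2 = ["\n"], node3 = ["\n"] }; merging var.clc_snippets
# over these defaults lets a machine-specific snippet list override the harmless "\n" default.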

View File

@ -24,27 +24,39 @@ variable "os_version" {
# Terraform's crude "type system" does not properly support lists of maps so we do this.
variable "controller_names" {
type = "list"
type = "list"
description = "Ordered list of controller names (e.g. [node1])"
}
variable "controller_macs" {
type = "list"
type = "list"
description = "Ordered list of controller identifying MAC addresses (e.g. [52:54:00:a1:9c:ae])"
}
variable "controller_domains" {
type = "list"
type = "list"
description = "Ordered list of controller FQDNs (e.g. [node1.example.com])"
}
variable "worker_names" {
type = "list"
type = "list"
description = "Ordered list of worker names (e.g. [node2, node3])"
}
variable "worker_macs" {
type = "list"
type = "list"
description = "Ordered list of worker identifying MAC addresses (e.g. [52:54:00:b2:2f:86, 52:54:00:c3:61:77])"
}
variable "worker_domains" {
type = "list"
type = "list"
description = "Ordered list of worker FQDNs (e.g. [node2.example.com, node3.example.com])"
}
variable "clc_snippets" {
type = "map"
description = "Map from machine names to lists of Container Linux Config snippets"
default = {}
}
# configuration
@ -129,17 +141,3 @@ variable "kernel_args" {
type = "list"
default = []
}
# unofficial, undocumented, unsupported, temporary
variable "controller_networkds" {
type = "list"
description = "Controller Container Linux config networkd section"
default = []
}
variable "worker_networkds" {
type = "list"
description = "Worker Container Linux config networkd section"
default = []
}

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.11.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.12.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=9e6fc7e697a9d6a0201768def7adfb3f168c56d7"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=81f19507faabf411db9c760d55f3d03f7d78f4c9"
cluster_name = "${var.cluster_name}"
api_servers = ["${var.k8s_domain_name}"]

View File

@ -83,9 +83,9 @@ runcmd:
- [systemctl, daemon-reload]
- [systemctl, restart, NetworkManager]
- [hostnamectl, set-hostname, ${domain_name}]
- "atomic install --system --name=etcd quay.io/poseidon/etcd:v3.3.8"
- "atomic install --system --name=kubelet quay.io/poseidon/kubelet:v1.11.0"
- "atomic install --system --name=bootkube quay.io/poseidon/bootkube:v0.12.0"
- "atomic install --system --name=etcd quay.io/poseidon/etcd:v3.3.10"
- "atomic install --system --name=kubelet quay.io/poseidon/kubelet:v1.12.1"
- "atomic install --system --name=bootkube quay.io/poseidon/bootkube:v0.13.0"
- [systemctl, start, --no-block, etcd.service]
- [systemctl, enable, kubelet.path]
- [systemctl, start, --no-block, kubelet.path]

View File

@ -59,7 +59,7 @@ runcmd:
- [systemctl, daemon-reload]
- [systemctl, restart, NetworkManager]
- [hostnamectl, set-hostname, ${domain_name}]
- "atomic install --system --name=kubelet quay.io/poseidon/kubelet:v1.11.0"
- "atomic install --system --name=kubelet quay.io/poseidon/kubelet:v1.12.1"
- [systemctl, enable, kubelet.path]
- [systemctl, start, --no-block, kubelet.path]
users:

View File

@ -25,27 +25,33 @@ EOD
# Terraform's crude "type system" does not properly support lists of maps so we do this.
variable "controller_names" {
type = "list"
type = "list"
description = "Ordered list of controller names (e.g. [node1])"
}
variable "controller_macs" {
type = "list"
type = "list"
description = "Ordered list of controller identifying MAC addresses (e.g. [52:54:00:a1:9c:ae])"
}
variable "controller_domains" {
type = "list"
type = "list"
description = "Ordered list of controller FQDNs (e.g. [node1.example.com])"
}
variable "worker_names" {
type = "list"
type = "list"
description = "Ordered list of worker names (e.g. [node2, node3])"
}
variable "worker_macs" {
type = "list"
type = "list"
description = "Ordered list of worker identifying MAC addresses (e.g. [52:54:00:b2:2f:86, 52:54:00:c3:61:77])"
}
variable "worker_domains" {
type = "list"
type = "list"
description = "Ordered list of worker FQDNs (e.g. [node2.example.com, node3.example.com])"
}
# configuration

View File

@ -11,9 +11,9 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.11.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.12.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
## Docs

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=9e6fc7e697a9d6a0201768def7adfb3f168c56d7"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=81f19507faabf411db9c760d55f3d03f7d78f4c9"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]

View File

@ -7,7 +7,7 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.8"
Environment="ETCD_IMAGE_TAG=v3.3.10"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
@ -128,7 +128,7 @@ storage:
contents:
inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.11.1
KUBELET_IMAGE_TAG=v1.12.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -149,7 +149,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.12.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.13.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \

View File

@ -98,7 +98,7 @@ storage:
contents:
inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.11.1
KUBELET_IMAGE_TAG=v1.12.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -116,7 +116,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://k8s.gcr.io/hyperkube:v1.11.1 \
docker://k8s.gcr.io/hyperkube:v1.12.1 \
--net=host \
--dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)

View File

@ -5,7 +5,7 @@ terraform {
}
provider "digitalocean" {
version = "~> 0.1.2"
version = "~> 1.0"
}
provider "local" {

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.11.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.12.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=9e6fc7e697a9d6a0201768def7adfb3f168c56d7"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=81f19507faabf411db9c760d55f3d03f7d78f4c9"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]

View File

@ -89,9 +89,9 @@ bootcmd:
- [modprobe, ip_vs]
runcmd:
- [systemctl, daemon-reload]
- "atomic install --system --name=etcd quay.io/poseidon/etcd:v3.3.8"
- "atomic install --system --name=kubelet quay.io/poseidon/kubelet:v1.11.0"
- "atomic install --system --name=bootkube quay.io/poseidon/bootkube:v0.12.0"
- "atomic install --system --name=etcd quay.io/poseidon/etcd:v3.3.10"
- "atomic install --system --name=kubelet quay.io/poseidon/kubelet:v1.12.1"
- "atomic install --system --name=bootkube quay.io/poseidon/bootkube:v0.13.0"
- [systemctl, start, --no-block, etcd.service]
- [systemctl, enable, cloud-metadata.service]
- [systemctl, enable, kubelet.path]

View File

@ -66,7 +66,7 @@ bootcmd:
runcmd:
- [systemctl, daemon-reload]
- [systemctl, enable, cloud-metadata.service]
- "atomic install --system --name=kubelet quay.io/poseidon/kubelet:v1.11.0"
- "atomic install --system --name=kubelet quay.io/poseidon/kubelet:v1.12.1"
- [systemctl, enable, kubelet.path]
- [systemctl, start, --no-block, kubelet.path]
users:

View File

@ -5,7 +5,7 @@ terraform {
}
provider "digitalocean" {
version = "~> 0.1.2"
version = "~> 1.0"
}
provider "local" {

View File

@ -4,7 +4,7 @@ Nginx Ingress controller pods accept and demultiplex HTTP, HTTPS, TCP, or UDP tr
## AWS
On AWS, a network load balancer (NLB) distributes traffic across a target group of worker nodes running an Ingress controller deployment on host ports 80 and 443. Firewall rules allow traffic to ports 80 and 443. Health check rules ensure only workers with a health Ingress controller receive traffic.
On AWS, a network load balancer (NLB) distributes traffic across a target group of worker nodes running an Ingress controller deployment. Security group rules allow traffic to ports 80 and 443. Health checks ensure only workers with a healthy Ingress controller receive traffic.
Create the Ingress controller deployment, service, RBAC roles, RBAC bindings, default backend, and namespace.
@ -12,15 +12,15 @@ Create the Ingress controller deployment, service, RBAC roles, RBAC bindings, de
kubectl apply -R -f addons/nginx-ingress/aws
```
For each application, add a DNS CNAME resolving to the ELB's DNS record.
For each application, add a DNS CNAME resolving to the NLB's DNS record.
```
app1.example.com -> tempest-ingress.123456.us-west2.elb.amazonaws.com
aap2.example.com -> tempest-ingress.123456.us-west2.elb.amazonaws.com
app2.example.com -> tempest-ingress.123456.us-west2.elb.amazonaws.com
app3.example.com -> tempest-ingress.123456.us-west2.elb.amazonaws.com
```
Find the ELB's DNS name through the console or use the Typhoon module's output `ingress_dns_name`. For example, you might use Terraform to manage a Google Cloud DNS record:
Find the NLB's DNS name through the console or use the Typhoon module's output `ingress_dns_name`. For example, you might use Terraform to manage a Google Cloud DNS record:
```tf
resource "google_dns_record_set" "some-application" {
@ -35,6 +35,70 @@ resource "google_dns_record_set" "some-application" {
}
```
## Azure
On Azure, a load balancer distributes traffic across a backend pool of worker nodes running an Ingress controller deployment. Security group rules allow traffic to ports 80 and 443. Health probes ensure only workers with a healthy Ingress controller receive traffic.
Create the Ingress controller deployment, service, RBAC roles, RBAC bindings, default backend, and namespace.
```
kubectl apply -R -f addons/nginx-ingress/azure
```
For each application, add a DNS record resolving to the load balancer's IPv4 address.
```
app1.example.com -> 11.22.33.44
app2.example.com -> 11.22.33.44
app3.example.com -> 11.22.33.44
```
Find the load balancer's IPv4 address with the Azure console or use the Typhoon module's output `ingress_static_ipv4`. For example, you might use Terraform to manage a Google Cloud DNS record:
```tf
resource "google_dns_record_set" "some-application" {
# DNS zone name
managed_zone = "example-zone"
# DNS record
name = "app.example.com."
type = "A"
ttl = 300
rrdatas = ["${module.azure-ramius.ingress_static_ipv4}"]
}
```
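Alternately, if the zone is hosted on Azure DNS, you might manage an equivalent A record with the `azurerm` provider (a sketch; the zone and resource group names are assumptions):
```tf
# Hypothetical Azure DNS A record pointing an application at the ingress load balancer
resource "azurerm_dns_a_record" "some-application" {
  # existing Azure DNS zone and its resource group (placeholders)
  resource_group_name = "example-group"
  zone_name           = "azure.example.com"

  # DNS record
  name    = "app"
  ttl     = 300
  records = ["${module.azure-ramius.ingress_static_ipv4}"]
}
```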
## Bare-Metal
On bare-metal, routing traffic to Ingress controller pods can be done in a number of ways.
### Equal-Cost Multi-Path
Create the Ingress controller deployment, service, RBAC roles, RBAC bindings, and default backend. The service should use a fixed ClusterIP (e.g. 10.3.0.12) in the Kubernetes service IPv4 CIDR range.
```
kubectl apply -R -f addons/nginx-ingress/bare-metal
```
There is no need for pods to use host networking or for the ingress service to use NodePort or LoadBalancer. Nodes already proxy packets destined for the service's ClusterIP to node(s) with a pod endpoint.
Configure the network router or load balancer with a static route for the Kubernetes service range and set the next hop to a node. Repeat for each node, as desired, and set the metric (i.e. cost) of each. Finally, DNAT traffic destined for the WAN on ports 80 or 443 to the service's fixed ClusterIP.
For each application, add a DNS record resolving to the WAN(s).
```tf
resource "google_dns_record_set" "some-application" {
# Managed DNS Zone name
managed_zone = "zone-name"
# Name of the DNS record
name = "app.example.com."
type = "A"
ttl = 300
rrdatas = ["SOME-WAN-IP"]
}
```
## Digital Ocean
On Digital Ocean, a DNS A record (e.g. `nemo-workers.example.com`) resolves to each worker[^1] running an Ingress controller DaemonSet on host ports 80 and 443. Firewall rules allow IPv4 and IPv6 traffic to ports 80 and 443.
@ -60,11 +124,11 @@ resource "google_dns_record_set" "some-application" {
}
```
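If the application's zone is also managed on DigitalOcean, a CNAME to the workers record might be managed with the `digitalocean` provider (a sketch; the domain is a placeholder):
```tf
# Hypothetical DigitalOcean CNAME from app.example.com to the workers DNS record
resource "digitalocean_record" "some-application" {
  domain = "example.com"
  type   = "CNAME"
  name   = "app"
  value  = "nemo-workers.example.com."
}
```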
[^1]: Digital Ocean does offers load balancers. We've opted not to use them to keep the Digital Ocean setup simple and cheap for developers.
[^1]: Digital Ocean does offer load balancers. We've opted not to use them to keep the Digital Ocean setup simple and cheap for developers.
## Google Cloud
On Google Cloud, a network load balancer distributes traffic across worker nodes (i.e. a target pool of backends) running an Ingress controller deployment on host ports 80 and 443. Firewall rules allow traffic to ports 80 and 443. Health check rules ensure the target pool only includes worker nodes with a healthy Nginx Ingress controller.
On Google Cloud, a TCP Proxy load balancer distributes traffic across a backend service of worker nodes running an Ingress controller deployment. Firewall rules allow traffic to ports 80 and 443. Health check rules ensure only workers with a healthy Ingress controller receive traffic.
Create the Ingress controller deployment, service, RBAC roles, RBAC bindings, default backend, and namespace.
@ -72,11 +136,11 @@ Create the Ingress controller deployment, service, RBAC roles, RBAC bindings, de
kubectl apply -R -f addons/nginx-ingress/google-cloud
```
For each application, add a DNS record resolving to the network load balancer's IPv4 address.
For each application, add a DNS record resolving to the load balancer's IPv4 address.
```
app1.example.com -> 11.22.33.44
aap2.example.com -> 11.22.33.44
app2.example.com -> 11.22.33.44
app3.example.com -> 11.22.33.44
```
@ -94,28 +158,3 @@ resource "google_dns_record_set" "some-application" {
rrdatas = ["${module.google-cloud-yavin.ingress_static_ipv4}"]
}
```
## Bare-Metal
On bare-metal, routing traffic to Ingress controller pods can be done in number of ways.
### Equal-Cost Multi-Path
Deploy the Nginx Ingress Controller as a deployment. Deploy the service with a fixed ClusterIP (e.g. 10.3.0.12) in the Kubernetes service IPv4 CIDR range. There is no need for a NodePort or for pods to bind host ports. Any node can proxy packets destined for the service's ClusterIP to a node which has a pod endpoint.
Configure the network router or load balancer with a static route for the Kubernetes service range and set the next hop to a node. Repeat for each node and set the metric (i.e. cost) of each. Finally, DNAT traffic destined for the WAN on ports 80 or 443 to the service's fixed ClusterIP.
Add a DNS record resolving to the WAN for each application.
```tf
resource "google_dns_record_set" "some-application" {
# Managed DNS Zone name
managed_zone = "zone-name"
# Name of the DNS record
name = "app.example.com."
type = "A"
ttl = 300
rrdatas = ["SOME-WAN-IP"]
}
```

View File

@ -47,4 +47,4 @@ Visit [127.0.0.1:9090](http://127.0.0.1:9090) to query [expressions](http://127.
<br/>
![Prometheus Alerts](../img/prometheus-alerts.png)
Use [Grafana](/addons/grafana.md) to view or build dashboards that use Prometheus as the datasource.
Use [Grafana](/addons/grafana/) to view or build dashboards that use Prometheus as the datasource.

View File

@ -19,7 +19,7 @@ Clusters are kept to a minimal Kubernetes control plane by offering components l
Container Linux Configs (CLCs) declare how a Container Linux instance's disk should be provisioned on first boot from disk. CLCs define disk partitions, filesystems, files, systemd units, dropins, networkd configs, mount units, raid arrays, and users. Typhoon creates controller and worker instances with base Container Linux Configs to create a minimal, secure Kubernetes cluster on each platform.
Typhoon AWS, Google Cloud, and Digital Ocean give users the ability to provide CLC *snippets* - valid Container Linux Configs that are validated and additively merged into the Typhoon base config during `terraform plan`. This allows advanced host customizations and experimentation.
Typhoon AWS, Azure, bare-metal, DigitalOcean, and Google Cloud support CLC *snippets* - valid Container Linux Configs that are validated and additively merged into the Typhoon base config during `terraform plan`. This allows advanced host customizations and experimentation.
#### Examples
@ -69,7 +69,7 @@ View the Container Linux Config [format](https://coreos.com/os/docs/1576.4.0/con
Write Container Linux Config *snippets* as files in the repository where you keep Terraform configs for clusters (perhaps in a `clc` or `snippets` subdirectory). You may organize snippets in multiple files as desired, provided they are each valid.
Define an [AWS](https://typhoon.psdn.io/aws/#cluster), [Google Cloud](https://typhoon.psdn.io/google-cloud/#cluster), or [Digital Ocean](https://typhoon.psdn.io/digital-ocean/#cluster) cluster and fill in the optional `controller_clc_snippets` or `worker_clc_snippets` fields.
[AWS](/cl/aws/#cluster), [Azure](/cl/azure/#cluster), [DigitalOcean](/cl/digital-ocean/#cluster), and [Google Cloud](/cl/google-cloud/#cluster) clusters allow populating a list of `controller_clc_snippets` or `worker_clc_snippets`.
```
module "digital-ocean-nemo" {
@ -89,6 +89,29 @@ module "digital-ocean-nemo" {
}
```
[Bare-Metal](/cl/bare-metal/#cluster) clusters allow different Container Linux snippets to be used for each node (since hardware may be heterogeneous). Populate the optional `clc_snippets` map variable with any controller or worker name keys and lists of snippets.
```
module "bare-metal-mercury" {
...
controller_names = ["node1"]
worker_names = [
"node2",
"node3",
]
clc_snippets = {
"node2" = [
"${file("./units/hello.yaml")}"
]
"node3" = [
"${file("./units/world.yaml")}",
"${file("./units/hello.yaml")}",
]
}
...
}
```
Plan the resources to be created.
```
@ -113,7 +136,7 @@ $ terraform apply
Container Linux Configs (and the CoreOS Ignition system) create immutable infrastructure. Disk provisioning is performed only on first boot from disk. That means if you change a snippet used by an instance, Terraform will (correctly) try to destroy and recreate that instance. Be careful!
!!! danger
Destroying and recreating controller instances is destructive! etcd runs on controller instances and stores data there. Do not modify controller snippets. See [blue/green](https://typhoon.psdn.io/topics/maintenance/#upgrades) clusters.
Destroying and recreating controller instances is destructive! etcd runs on controller instances and stores data there. Do not modify controller snippets. See [blue/green](/topics/maintenance/#upgrades) clusters.
### Fedora Atomic

View File

@ -1,11 +1,12 @@
# Worker Pools
Typhoon AWS and Google Cloud allow additional groups of workers to be defined and joined to a cluster. For example, add worker pools of instances with different types, disk sizes, Container Linux channels, or preemptibility modes.
Typhoon AWS, Azure, and Google Cloud allow additional groups of workers to be defined and joined to a cluster. For example, add worker pools of instances with different types, disk sizes, Container Linux channels, or preemptibility modes.
Internal Terraform Modules:
* `aws/container-linux/kubernetes/workers`
* `aws/fedora-atomic/kubernetes/workers`
* `azure/container-linux/kubernetes/workers`
* `google-cloud/container-linux/kubernetes/workers`
* `google-cloud/fedora-atomic/kubernetes/workers`
@ -15,7 +16,7 @@ Create a cluster following the AWS [tutorial](../cl/aws.md#cluster). Define a wo
```tf
module "tempest-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes/workers?ref=v1.11.1"
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes/workers?ref=v1.12.1"
providers = {
aws = "aws.default"
@ -31,6 +32,7 @@ module "tempest-worker-pool" {
kubeconfig = "${module.aws-tempest.kubeconfig}"
ssh_authorized_key = "${var.ssh_authorized_key}"
# optional
count = 2
instance_type = "m5.large"
os_image = "coreos-beta"
@ -43,7 +45,7 @@ Apply the change.
terraform apply
```
Verify an auto-scaling group of workers join the cluster within a few minutes.
Verify an auto-scaling group of workers joins the cluster within a few minutes.
### Variables
@ -53,10 +55,10 @@ The AWS internal `workers` module supports a number of [variables](https://githu
| Name | Description | Example |
|:-----|:------------|:--------|
| name | Unique name (distinct from cluster name) | "tempest-m5s" |
| vpc_id | Must be set to `vpc_id` output by cluster | "${module.cluster.vpc_id}" |
| subnet_ids | Must be set to `subnet_ids` output by cluster | "${module.cluster.subnet_ids}" |
| security_groups | Must be set to `worker_security_groups` output by cluster | "${module.cluster.worker_security_groups}" |
| name | Unique name (distinct from cluster name) | "tempest-m5s" |
| kubeconfig | Must be set to `kubeconfig` output by cluster | "${module.cluster.kubeconfig}" |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
@ -74,20 +76,90 @@ The AWS internal `workers` module supports a number of [variables](https://githu
Check the list of valid [instance types](https://aws.amazon.com/ec2/instance-types/) or per-region and per-type [spot prices](https://aws.amazon.com/ec2/spot/pricing/).
## Azure
Create a cluster following the Azure [tutorial](../cl/azure.md#cluster). Define a worker pool using the Azure internal `workers` module.
```tf
module "ramius-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes/workers?ref=v1.12.1"
providers = {
azurerm = "azurerm.default"
}
# Azure
region = "${module.azure-ramius.region}"
resource_group_name = "${module.azure-ramius.resource_group_name}"
subnet_id = "${module.azure-ramius.subnet_id}"
security_group_id = "${module.azure-ramius.security_group_id}"
backend_address_pool_id = "${module.azure-ramius.backend_address_pool_id}"
# configuration
name = "ramius-low-priority"
kubeconfig = "${module.azure-ramius.kubeconfig}"
ssh_authorized_key = "${var.ssh_authorized_key}"
# optional
count = 2
vm_type = "Standard_F4"
priority = "Low"
}
```
Apply the change.
```
terraform apply
```
Verify a scale set of workers joins the cluster within a few minutes.
### Variables
The Azure internal `workers` module supports a number of [variables](https://github.com/poseidon/typhoon/blob/master/azure/container-linux/kubernetes/workers/variables.tf).
#### Required
| Name | Description | Example |
|:-----|:------------|:--------|
| name | Unique name (distinct from cluster name) | "ramius-f4" |
| region | Must be set to `region` output by cluster | "${module.cluster.region}" |
| resource_group_name | Must be set to `resource_group_name` output by cluster | "${module.cluster.resource_group_name}" |
| subnet_id | Must be set to `subnet_id` output by cluster | "${module.cluster.subnet_id}" |
| security_group_id | Must be set to `security_group_id` output by cluster | "${module.cluster.security_group_id}" |
| backend_address_pool_id | Must be set to `backend_address_pool_id` output by cluster | "${module.cluster.backend_address_pool_id}" |
| kubeconfig | Must be set to `kubeconfig` output by cluster | "${module.cluster.kubeconfig}" |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
#### Optional
| Name | Description | Default | Example |
|:-----|:------------|:--------|:--------|
| count | Number of instances | 1 | 3 |
| vm_type | Machine type for instances | "Standard_F1" | See below |
| os_image | Channel for a Container Linux derivative | coreos-stable | coreos-stable, coreos-beta, coreos-alpha |
| priority | Set priority to Low to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time | Regular | Low |
| clc_snippets | Container Linux Config snippets | [] | [example](/advanced/customization/#usage) |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by coredns. | "cluster.local" | "k8s.example.com" |
Check the list of valid [machine types](https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux/) and their [specs](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/sizes-general). Use `az vm list-skus` to get the identifier.
## Google Cloud
Create a cluster following the Google Cloud [tutorial](../cl/google-cloud.md#cluster). Define a worker pool using the Google Cloud internal `workers` module.
```tf
module "yavin-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.11.1"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.12.1"
providers = {
google = "google.default"
}
# Google Cloud
region = "us-central1"
region = "europe-west2"
network = "${module.google-cloud-yavin.network_name}"
cluster_name = "yavin"
@ -96,6 +168,7 @@ module "yavin-worker-pool" {
kubeconfig = "${module.google-cloud-yavin.kubeconfig}"
ssh_authorized_key = "${var.ssh_authorized_key}"
# optional
count = 2
machine_type = "n1-standard-16"
os_image = "coreos-beta"
@ -114,11 +187,11 @@ Verify a managed instance group of workers joins the cluster within a few minute
```
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.11.1
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.11.1
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.11.1
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.11.1
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.11.1
yavin-controller-0.c.example-com.internal Ready 6m v1.12.1
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.12.1
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.12.1
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.12.1
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.12.1
```
### Variables
@ -129,13 +202,15 @@ The Google Cloud internal `workers` module supports a number of [variables](http
| Name | Description | Example |
|:-----|:------------|:--------|
| region | Must be set to `region` of cluster | "us-central1" |
| network | Must be set to `network_name` output by cluster | "${module.cluster.network_name}" |
| name | Unique name (distinct from cluster name) | "yavin-16x" |
| region | Region for the worker pool instances. May differ from the cluster's region | "europe-west2" |
| network | Must be set to `network_name` output by cluster | "${module.cluster.network_name}" |
| cluster_name | Must be set to `cluster_name` of cluster | "yavin" |
| kubeconfig | Must be set to `kubeconfig` output by cluster | "${module.cluster.kubeconfig}" |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
Check the list of regions in the [docs](https://cloud.google.com/compute/docs/regions-zones/regions-zones) or use `gcloud compute regions list`.
#### Optional
| Name | Description | Default | Example |

View File

@ -8,7 +8,7 @@ Let's cover the concepts you'll need to get started.
#### Nodes
Cluster nodes provision themselves from a declarative configuration upfront. Nodes run a `kubelet` service and register themselves with the control plane to join the higher order cluster. All nodes run `kube-proxy` and `calico` or `flannel` pods.
All cluster nodes provision themselves from a declarative configuration upfront. Nodes run a `kubelet` service and register themselves with the control plane to join the cluster. All nodes run `kube-proxy` and `calico` or `flannel` pods.
#### Controllers
@ -77,6 +77,7 @@ infra/
└── terraform
└── clusters
├── aws-tempest.tf
├── azure-ramius.tf
├── bare-metal-mercury.tf
├── google-cloud-yavin.tf
├── digital-ocean-nemo.tf

View File

@ -1,6 +1,6 @@
# Operating Systems
Typhoon supports [Container Linux](https://coreos.com/why/) and Fedora [Atomic](https://www.projectatomic.io/) 27. These two operating systems were chosen because they offer:
Typhoon supports [Container Linux](https://coreos.com/why/) and Fedora [Atomic](https://www.projectatomic.io/) 28. These two operating systems were chosen because they offer:
* Minimalism and focus on clustered operation
* Automated and atomic operating system upgrades

View File

@ -3,7 +3,7 @@
!!! danger
Typhoon for Fedora Atomic is alpha. Expect rough edges and changes.
In this tutorial, we'll create a Kubernetes v1.11.1 cluster on AWS with Fedora Atomic.
In this tutorial, we'll create a Kubernetes v1.12.1 cluster on AWS with Fedora Atomic.
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets. Instances are provisioned on first boot with cloud-init.
@ -24,7 +24,7 @@ $ terraform version
Terraform v0.11.7
```
Read [concepts](../architecture/concepts.md) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
```
cd infra/clusters
@ -83,7 +83,7 @@ Define a Kubernetes cluster using the module `aws/fedora-atomic/kubernetes`.
```tf
module "aws-tempest" {
source = "git::https://github.com/poseidon/typhoon//aws/fedora-atomic/kubernetes?ref=v1.11.1"
source = "git::https://github.com/poseidon/typhoon//aws/fedora-atomic/kubernetes?ref=v1.12.1"
providers = {
aws = "aws.default"
@ -156,9 +156,9 @@ In 5-10 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/tempest/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
ip-10-0-12-221 Ready 34m v1.11.1
ip-10-0-19-112 Ready 34m v1.11.1
ip-10-0-4-22 Ready 34m v1.11.1
ip-10-0-12-221 Ready 34m v1.12.1
ip-10-0-19-112 Ready 34m v1.12.1
ip-10-0-4-22 Ready 34m v1.12.1
```
List the pods.
@ -184,7 +184,7 @@ kube-system pod-checkpointer-4kxtl-ip-10-0-12-221 1/1 Running 0
## Going Further
Learn about [maintenance](../topics/maintenance.md) and [addons](../addons/overview.md).
Learn about [maintenance](/topics/maintenance/) and [addons](/addons/overview/).
## Variables

View File

@ -3,7 +3,7 @@
!!! danger
Typhoon for Fedora Atomic is alpha. Expect rough edges and changes.
In this tutorial, we'll network boot and provision a Kubernetes v1.11.1 cluster on bare-metal with Fedora Atomic.
In this tutorial, we'll network boot and provision a Kubernetes v1.12.1 cluster on bare-metal with Fedora Atomic.
First, we'll deploy a [Matchbox](https://github.com/coreos/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Fedora Atomic via kickstart, reboot into the disk install, and provision themselves as Kubernetes controllers or workers via cloud-init.
@ -95,7 +95,7 @@ For networks already supporting iPXE clients, you can add a `default.ipxe` confi
chain http://matchbox.foo:8080/boot.ipxe
```
For networks with Ubiquiti Routers, you can [configure the router](/topics/hardware.md#ubiquiti) itself to chainload machines to iPXE and Matchbox.
For networks with Ubiquiti Routers, you can [configure the router](/topics/hardware/#ubiquiti) itself to chainload machines to iPXE and Matchbox.
For a small lab, you may wish to check out the [quay.io/coreos/dnsmasq](https://quay.io/repository/coreos/dnsmasq) container image and [copy-paste examples](https://github.com/coreos/matchbox/blob/master/Documentation/network-setup.md#coreosdnsmasq).
@ -190,7 +190,7 @@ providers {
}
```
Read [concepts](../architecture/concepts.md) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
```
cd infra/clusters
@ -235,7 +235,7 @@ Define a Kubernetes cluster using the module `bare-metal/fedora-atomic/kubernete
```tf
module "bare-metal-mercury" {
source = "git::https://github.com/poseidon/typhoon//bare-metal/fedora-atomic/kubernetes?ref=v1.11.1"
source = "git::https://github.com/poseidon/typhoon//bare-metal/fedora-atomic/kubernetes?ref=v1.12.1"
providers = {
local = "local.default"
@ -361,9 +361,9 @@ bootkube[5]: Tearing down temporary bootstrap control plane...
$ export KUBECONFIG=/home/user/.secrets/clusters/mercury/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
node1.example.com Ready 11m v1.11.1
node2.example.com Ready 11m v1.11.1
node3.example.com Ready 11m v1.11.1
node1.example.com Ready 11m v1.12.1
node2.example.com Ready 11m v1.12.1
node3.example.com Ready 11m v1.12.1
```
List the pods.
@ -389,7 +389,7 @@ kube-system pod-checkpointer-wf65d-node1.example.com 1/1 Running 0
## Going Further
Learn about [maintenance](../topics/maintenance.md) and [addons](../addons/overview.md).
Learn about [maintenance](/topics/maintenance/) and [addons](/addons/overview/).
## Variables

View File

@ -3,7 +3,7 @@
!!! danger
Typhoon for Fedora Atomic is alpha. Expect rough edges and changes.
In this tutorial, we'll create a Kubernetes v1.11.1 cluster on DigitalOcean with Fedora Atomic.
In this tutorial, we'll create a Kubernetes v1.12.1 cluster on DigitalOcean with Fedora Atomic.
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create controller droplets, worker droplets, DNS records, tags, and TLS assets. Instances are provisioned on first boot with cloud-init.
@ -24,7 +24,7 @@ $ terraform version
Terraform v0.11.7
```
Read [concepts](../architecture/concepts.md) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
```
cd infra/clusters
@ -45,7 +45,7 @@ Configure the DigitalOcean provider to use your token in a `providers.tf` file.
```tf
provider "digitalocean" {
version = "0.1.3"
version = "1.0.0"
token = "${chomp(file("~/.config/digital-ocean/token"))}"
alias = "default"
}
@ -77,7 +77,7 @@ Define a Kubernetes cluster using the module `digital-ocean/fedora-atomic/kubern
```tf
module "digital-ocean-nemo" {
source = "git::https://github.com/poseidon/typhoon//digital-ocean/fedora-atomic/kubernetes?ref=v1.11.1"
source = "git::https://github.com/poseidon/typhoon//digital-ocean/fedora-atomic/kubernetes?ref=v1.12.1"
providers = {
digitalocean = "digitalocean.default"
@ -152,9 +152,9 @@ In 3-6 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/nemo/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
10.132.110.130 Ready 10m v1.11.1
10.132.115.81 Ready 10m v1.11.1
10.132.124.107 Ready 10m v1.11.1
10.132.110.130 Ready 10m v1.12.1
10.132.115.81 Ready 10m v1.12.1
10.132.124.107 Ready 10m v1.12.1
```
List the pods.
@ -179,7 +179,7 @@ kube-system pod-checkpointer-pr1lq-10.132.115.81 1/1 Running 0
## Going Further
Learn about [maintenance](../topics/maintenance.md) and [addons](../addons/overview.md).
Learn about [maintenance](/topics/maintenance/) and [addons](/addons/overview/).
## Variables

View File

@ -3,7 +3,7 @@
!!! danger
Typhoon for Fedora Atomic is alpha. Fedora does not publish official images for Google Cloud so you must prepare them yourself. Expect rough edges and changes.
In this tutorial, we'll create a Kubernetes v1.11.1 cluster on Google Compute Engine with Fedora Atomic.
In this tutorial, we'll create a Kubernetes v1.12.1 cluster on Google Compute Engine with Fedora Atomic.
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a network, firewall rules, health checks, controller instances, worker managed instance group, load balancers, and TLS assets. Instances are provisioned on first boot with cloud-init.
@ -12,7 +12,7 @@ Controllers are provisioned to run an `etcd` peer and a `kubelet` service. Worke
## Requirements
* Google Cloud Account and Service Account
* Google Cloud DNS Zone (registered Domain Name or delegated subdomain)
* Google Cloud DNS Zone (registered Domain Name or delegated subdomain)
* Terraform v0.11.x installed locally
* `gcloud` and `gsutil` for uploading a disk image to Google Cloud (temporary)
@ -25,7 +25,7 @@ $ terraform version
Terraform v0.11.7
```
Read [concepts](../architecture/concepts.md) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
```
cd infra/clusters
@ -121,7 +121,7 @@ Define a Kubernetes cluster using the module `google-cloud/fedora-atomic/kuberne
```tf
module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-atomic/kubernetes?ref=v1.11.1"
source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-atomic/kubernetes?ref=v1.12.1"
providers = {
google = "google.default"
@ -197,9 +197,9 @@ In 5-10 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.11.1
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.11.1
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.11.1
yavin-controller-0.c.example-com.internal Ready 6m v1.12.1
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.12.1
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.12.1
```
List the pods.
@ -224,7 +224,7 @@ kube-system pod-checkpointer-l6lrt 1/1 Running 0
## Going Further
Learn about [maintenance](../topics/maintenance.md) and [addons](../addons/overview.md).
Learn about [maintenance](/topics/maintenance/) and [addons](/addons/overview/).
## Variables

View File

@ -1,6 +1,6 @@
# AWS
In this tutorial, we'll create a Kubernetes v1.11.1 cluster on AWS with Container Linux.
In this tutorial, we'll create a Kubernetes v1.12.1 cluster on AWS with Container Linux.
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets.
@ -18,7 +18,7 @@ Install [Terraform](https://www.terraform.io/downloads.html) v0.11.x on your sys
```sh
$ terraform version
Terraform v0.11.1
Terraform v0.11.7
```
Add the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin binary for your system.
@ -37,7 +37,7 @@ providers {
}
```
Read [concepts](../architecture/concepts.md) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
```
cd infra/clusters
@ -96,7 +96,7 @@ Define a Kubernetes cluster using the module `aws/container-linux/kubernetes`.
```tf
module "aws-tempest" {
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.11.1"
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.12.1"
providers = {
aws = "aws.default"
@ -169,9 +169,9 @@ In 4-8 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/tempest/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
ip-10-0-12-221 Ready 34m v1.11.1
ip-10-0-19-112 Ready 34m v1.11.1
ip-10-0-4-22 Ready 34m v1.11.1
ip-10-0-12-221 Ready 34m v1.12.1
ip-10-0-19-112 Ready 34m v1.12.1
ip-10-0-4-22 Ready 34m v1.12.1
```
List the pods.
@ -197,7 +197,7 @@ kube-system pod-checkpointer-4kxtl-ip-10-0-12-221 1/1 Running 0
## Going Further
Learn about [maintenance](../topics/maintenance.md) and [addons](../addons/overview.md).
Learn about [maintenance](/topics/maintenance/) and [addons](/addons/overview/).
!!! note
On Container Linux clusters, install the `CLUO` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
@ -245,8 +245,8 @@ Reference the DNS zone id with `"${aws_route53_zone.zone-for-clusters.zone_id}"`
| disk_size | Size of the EBS volume in GB | "40" | "100" |
| disk_type | Type of the EBS volume | "gp2" | standard, gp2, io1 |
| worker_price | Spot price in USD for workers. Leave as default empty string for regular on-demand instances | "" | "0.10" |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/) |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/) |
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
| network_mtu | CNI interface MTU (calico only) | 1480 | 8981 |
| host_cidr | CIDR IPv4 range to assign to EC2 instances | "10.0.0.0/16" | "10.1.0.0/16" |

277
docs/cl/azure.md Normal file
View File

@ -0,0 +1,277 @@
# Azure
!!! danger
Typhoon for Azure is alpha. For production, use AWS, Google Cloud, or bare-metal. As Azure matures, check [errata](https://github.com/poseidon/typhoon/wiki/Errata) for known shortcomings.
In this tutorial, we'll create a Kubernetes v1.12.1 cluster on Azure with Container Linux.
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a resource group, virtual network, subnets, security groups, controller availability set, worker scale set, load balancer, and TLS assets.
Controllers are provisioned to run an `etcd-member` peer and a `kubelet` service. Workers run just a `kubelet` service. A one-time [bootkube](https://github.com/kubernetes-incubator/bootkube) bootstrap schedules the `apiserver`, `scheduler`, `controller-manager`, and `coredns` on controllers and schedules `kube-proxy` and `flannel` on every node. A generated `kubeconfig` provides `kubectl` access to the cluster.
## Requirements
* Azure account
* Azure DNS Zone (registered Domain Name or delegated subdomain)
* Terraform v0.11.x and [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) installed locally
## Terraform Setup
Install [Terraform](https://www.terraform.io/downloads.html) v0.11.x on your system.
```sh
$ terraform version
Terraform v0.11.7
```
Add the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin binary for your system.
```sh
wget https://github.com/coreos/terraform-provider-ct/releases/download/v0.2.1/terraform-provider-ct-v0.2.1-linux-amd64.tar.gz
tar xzf terraform-provider-ct-v0.2.1-linux-amd64.tar.gz
sudo mv terraform-provider-ct-v0.2.1-linux-amd64/terraform-provider-ct /usr/local/bin/
```
Add the plugin to your `~/.terraformrc`.
```
providers {
ct = "/usr/local/bin/terraform-provider-ct"
}
```
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
```
cd infra/clusters
```
## Provider
[Install](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest) the Azure `az` command line tool to [authenticate with Azure](https://www.terraform.io/docs/providers/azurerm/authenticating_via_azure_cli.html).
```
az login
```
Configure the Azure provider in a `providers.tf` file.
```tf
provider "azurerm" {
version = "1.16.0"
alias = "default"
}
provider "local" {
version = "~> 1.0"
alias = "default"
}
provider "null" {
version = "~> 1.0"
alias = "default"
}
provider "template" {
version = "~> 1.0"
alias = "default"
}
provider "tls" {
version = "~> 1.0"
alias = "default"
}
```
Additional configuration options are described in the `azurerm` provider [docs](https://www.terraform.io/docs/providers/azurerm/).
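For non-interactive use (e.g. CI), the `azurerm` provider can instead be configured with service principal credentials rather than `az login`. A sketch; the IDs are placeholders and real secrets should come from the environment (e.g. `ARM_CLIENT_SECRET`) rather than be committed:
```tf
# Alternative to `az login`: service principal credentials (placeholders).
# This replaces the azurerm provider block shown above.
provider "azurerm" {
  version         = "1.16.0"
  alias           = "default"
  subscription_id = "00000000-0000-0000-0000-000000000000"
  tenant_id       = "00000000-0000-0000-0000-000000000000"
  client_id       = "00000000-0000-0000-0000-000000000000"
  # hypothetical variable holding the secret; do not hard-code it
  client_secret   = "${var.azure_client_secret}"
}
```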
## Cluster
Define a Kubernetes cluster using the module `azure/container-linux/kubernetes`.
```tf
module "azure-ramius" {
source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes?ref=v1.12.1"
providers = {
azurerm = "azurerm.default"
local = "local.default"
null = "null.default"
template = "template.default"
tls = "tls.default"
}
# Azure
cluster_name = "ramius"
region = "centralus"
dns_zone = "azure.example.com"
dns_zone_group = "example-group"
# configuration
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
asset_dir = "/home/user/.secrets/clusters/ramius"
# optional
worker_count = 3
host_cidr = "10.0.0.0/20"
}
```
Reference the [variables docs](#variables) or the [variables.tf](https://github.com/poseidon/typhoon/blob/master/azure/container-linux/kubernetes/variables.tf) source.
## ssh-agent
Initial bootstrapping requires `bootkube.service` be started on one controller node. Terraform uses `ssh-agent` to automate this step. Add your SSH private key to `ssh-agent`.
```sh
ssh-add ~/.ssh/id_rsa
ssh-add -L
```
## Apply
Initialize the config directory if this is the first use with Terraform.
```sh
terraform init
```
Plan the resources to be created.
```sh
$ terraform plan
Plan: 86 to add, 0 to change, 0 to destroy.
```
Apply the changes to create the cluster.
```sh
$ terraform apply
...
module.azure-ramius.null_resource.bootkube-start: Still creating... (6m50s elapsed)
module.azure-ramius.null_resource.bootkube-start: Still creating... (7m0s elapsed)
module.azure-ramius.null_resource.bootkube-start: Creation complete after 7m8s (ID: 3961816482286168143)
Apply complete! Resources: 86 added, 0 changed, 0 destroyed.
```
In 4-8 minutes, the Kubernetes cluster will be ready.
## Verify
[Install kubectl](https://coreos.com/kubernetes/docs/latest/configure-kubectl.html) on your system. Use the generated `kubeconfig` credentials to access the Kubernetes cluster and list nodes.
```
$ export KUBECONFIG=/home/user/.secrets/clusters/ramius/auth/kubeconfig
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ramius-controller-0 Ready controller,master 24m v1.12.1
ramius-worker-000001 Ready node 25m v1.12.1
ramius-worker-000002 Ready node 24m v1.12.1
ramius-worker-000005 Ready node 24m v1.12.1
```
List the pods.
```
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-7c6fbb4f4b-b6qzx 1/1 Running 0 26m
kube-system kube-apiserver-hxgsx 1/1 Running 3 26m
kube-system kube-controller-manager-5ff9cd7bb6-b942n 1/1 Running 0 26m
kube-system kube-controller-manager-5ff9cd7bb6-bbr6w 1/1 Running 0 26m
kube-system kube-flannel-bwf24 2/2 Running 2 26m
kube-system kube-flannel-ks5qb 2/2 Running 0 26m
kube-system kube-flannel-nghsx 2/2 Running 2 26m
kube-system kube-flannel-tq2wg 2/2 Running 0 26m
kube-system kube-proxy-j4vpq 1/1 Running 0 26m
kube-system kube-proxy-jxr5d 1/1 Running 0 26m
kube-system kube-proxy-lbdw5 1/1 Running 0 26m
kube-system kube-proxy-v8r7c 1/1 Running 0 26m
kube-system kube-scheduler-5f76d69686-s4fbx 1/1 Running 0 26m
kube-system kube-scheduler-5f76d69686-vgdgn 1/1 Running 0 26m
kube-system pod-checkpointer-cnqdg 1/1 Running 0 26m
kube-system pod-checkpointer-cnqdg-ramius-controller-0 1/1 Running 0 25m
```
## Going Further
Learn about [maintenance](/topics/maintenance/) and [addons](/addons/overview/).
!!! note
On Container Linux clusters, install the `CLUO` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
## Variables
Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/azure/container-linux/kubernetes/variables.tf) source.
### Required
| Name | Description | Example |
|:-----|:------------|:--------|
| cluster_name | Unique cluster name (prepended to dns_zone) | "ramius" |
| region | Azure region | "centralus" |
| dns_zone | Azure DNS zone | "azure.example.com" |
| dns_zone_group | Resource group where the Azure DNS zone resides | "global" |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/ramius" |
!!! tip
Regions are shown in [docs](https://azure.microsoft.com/en-us/global-infrastructure/regions/) or with `az account list-locations --output table`.
#### DNS Zone
Clusters create a DNS A record `${cluster_name}.${dns_zone}` to resolve a load balancer backed by controller instances. This FQDN is used by workers and `kubectl` to access the apiserver(s). In this example, the cluster's apiserver would be accessible at `ramius.azure.example.com`.
You'll need a registered domain name or delegated subdomain on Azure DNS. You can set this up once and create many clusters with unique names.
```tf
# Azure resource group for DNS zone
resource "azurerm_resource_group" "global" {
name = "global"
location = "centralus"
}
# DNS zone for clusters
resource "azurerm_dns_zone" "clusters" {
resource_group_name = "${azurerm_resource_group.global.name}"
name = "azure.example.com"
zone_type = "Public"
}
```
Reference the DNS zone with `"${azurerm_dns_zone.clusters.name}"` and its resource group with `"${azurerm_resource_group.global.name}"`.
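For example, the cluster definition above might reference them directly (a sketch showing just the two affected fields):
```tf
module "azure-ramius" {
  ...
  # DNS zone and the resource group where it resides
  dns_zone       = "${azurerm_dns_zone.clusters.name}"
  dns_zone_group = "${azurerm_resource_group.global.name}"
  ...
}
```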
!!! tip ""
If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Azure DNS (e.g. azure.mydomain.com) and [update nameservers](https://docs.microsoft.com/en-us/azure/dns/dns-delegate-domain-azure-dns).
### Optional
| Name | Description | Default | Example |
|:-----|:------------|:--------|:--------|
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
| worker_count | Number of workers | 1 | 3 |
| controller_type | Machine type for controllers | "Standard_DS1_v2" | See below |
| worker_type | Machine type for workers | "Standard_F1" | See below |
| os_image | Channel for a Container Linux derivative | coreos-stable | coreos-stable, coreos-beta, coreos-alpha |
| disk_size | Size of the disk in GB | "40" | "100" |
| worker_priority | Set priority to Low to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time | Regular | Low |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/#usage) |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/#usage) |
| host_cidr | CIDR IPv4 range to assign to instances | "10.0.0.0/16" | "10.0.0.0/20" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by coredns. | "cluster.local" | "k8s.example.com" |
Check the list of valid [machine types](https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux/) and their [specs](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/sizes-general). Use `az vm list-skus` to get the identifier.
!!! warning
Unlike AWS and GCP, Azure requires its *virtual* networks to have non-overlapping IPv4 CIDRs (yeah, go figure). Instead of each cluster just using `10.0.0.0/16` for instances, each Azure cluster's `host_cidr` must be non-overlapping (e.g. 10.0.0.0/20 for the 1st cluster, 10.0.16.0/20 for the 2nd cluster, etc).
!!! warning
Do not choose a `controller_type` smaller than `Standard_DS1_v2`. Smaller instances are not sufficient for running a controller.
#### Low Priority
Add `worker_priority=Low` to use [Low Priority](https://docs.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-use-low-priority) workers that run on Azure's surplus capacity at lower cost, but with the tradeoff that they can be deallocated at random. Low priority VMs are Azure's analog to AWS spot instances or GCP preemptible instances.
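For example, you might extend the cluster definition above (a sketch of just the relevant fields):
```tf
module "azure-ramius" {
  ...
  # optional: run workers on surplus capacity, deallocatable at any time
  worker_priority = "Low"
  worker_count    = 3
  ...
}
```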

View File

@ -1,6 +1,6 @@
# Bare-Metal
In this tutorial, we'll network boot and provision a Kubernetes v1.11.1 cluster on bare-metal with Container Linux.
In this tutorial, we'll network boot and provision a Kubernetes v1.12.1 cluster on bare-metal with Container Linux.
First, we'll deploy a [Matchbox](https://github.com/coreos/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Container Linux to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers via Ignition.
@ -12,7 +12,7 @@ Controllers are provisioned to run an `etcd-member` peer and a `kubelet` service
* PXE-enabled [network boot](https://coreos.com/matchbox/docs/latest/network-setup.html) environment
* Matchbox v0.6+ deployment with API enabled
* Matchbox credentials `client.crt`, `client.key`, `ca.crt`
* Terraform v0.11.x and [terraform-provider-matchbox](https://github.com/coreos/terraform-provider-matchbox) installed locally
* Terraform v0.11.x, [terraform-provider-matchbox](https://github.com/coreos/terraform-provider-matchbox), and [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) installed locally
## Machines
@ -91,7 +91,7 @@ For networks already supporting iPXE clients, you can add a `default.ipxe` confi
chain http://matchbox.foo:8080/boot.ipxe
```
For networks with Ubiquiti Routers, you can [configure the router](/topics/hardware.md#ubiquiti) itself to chainload machines to iPXE and Matchbox.
For networks with Ubiquiti Routers, you can [configure the router](/topics/hardware/#ubiquiti) itself to chainload machines to iPXE and Matchbox.
For a small lab, you may wish to check out the [quay.io/coreos/dnsmasq](https://quay.io/repository/coreos/dnsmasq) container image and [copy-paste examples](https://github.com/coreos/matchbox/blob/master/Documentation/network-setup.md#coreosdnsmasq).
@ -110,7 +110,7 @@ Install [Terraform](https://www.terraform.io/downloads.html) v0.11.x on your sys
```sh
$ terraform version
Terraform v0.11.1
Terraform v0.11.7
```
Add the [terraform-provider-matchbox](https://github.com/coreos/terraform-provider-matchbox) plugin binary for your system.
@ -121,6 +121,14 @@ tar xzf terraform-provider-matchbox-v0.2.2-linux-amd64.tar.gz
sudo mv terraform-provider-matchbox-v0.2.2-linux-amd64/terraform-provider-matchbox /usr/local/bin/
```
Add the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin binary for your system.
```sh
wget https://github.com/coreos/terraform-provider-ct/releases/download/v0.2.1/terraform-provider-ct-v0.2.1-linux-amd64.tar.gz
tar xzf terraform-provider-ct-v0.2.1-linux-amd64.tar.gz
sudo mv terraform-provider-ct-v0.2.1-linux-amd64/terraform-provider-ct /usr/local/bin/
```
Add the plugin to your `~/.terraformrc`.
```
@ -129,7 +137,7 @@ providers {
}
```
Read [concepts](../architecture/concepts.md) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
```
cd infra/clusters
@ -174,7 +182,7 @@ Define a Kubernetes cluster using the module `bare-metal/container-linux/kuberne
```tf
module "bare-metal-mercury" {
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.11.1"
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.12.1"
providers = {
local = "local.default"
@ -283,9 +291,9 @@ Apply complete! Resources: 55 added, 0 changed, 0 destroyed.
To watch the install to disk (until machines reboot from disk), SSH to port 2222.
```
# before v1.11.1
# before v1.12.1
$ ssh debug@node1.example.com
# after v1.11.1
# after v1.12.1
$ ssh -p 2222 core@node1.example.com
```
@ -310,9 +318,9 @@ bootkube[5]: Tearing down temporary bootstrap control plane...
$ export KUBECONFIG=/home/user/.secrets/clusters/mercury/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
node1.example.com Ready 11m v1.11.1
node2.example.com Ready 11m v1.11.1
node3.example.com Ready 11m v1.11.1
node1.example.com Ready 11m v1.12.1
node2.example.com Ready 11m v1.12.1
node3.example.com Ready 11m v1.12.1
```
List the pods.
@ -338,7 +346,7 @@ kube-system pod-checkpointer-wf65d-node1.example.com 1/1 Running 0
## Going Further
Learn about [maintenance](../topics/maintenance.md) and [addons](../addons/overview.md).
Learn about [maintenance](/topics/maintenance/) and [addons](/addons/overview/).
!!! note
On Container Linux clusters, install the `CLUO` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
@ -369,10 +377,11 @@ Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/bare-me
| Name | Description | Default | Example |
|:-----|:------------|:--------|:--------|
| cached_install | Whether machines should PXE boot and install from the Matchbox `/assets` cache. Admin MUST have downloaded Container Linux images into the cache to use this (coreos only for now) | false | true |
| cached_install | PXE boot and install from the Matchbox `/assets` cache. Admin MUST have downloaded Container Linux or Flatcar images into the cache | false | true |
| install_disk | Disk device where Container Linux should be installed | "/dev/sda" | "/dev/sdb" |
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
| network_mtu | CNI interface MTU (calico-only) | 1480 | - |
| clc_snippets | Map from machine names to lists of Container Linux Config snippets | {} | [example](/advanced/customization/#usage) |
| network_ip_autodetection_method | Method to detect host IPv4 address (calico-only) | first-found | can-reach=10.0.0.1 |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |

Some files were not shown because too many files have changed in this diff.