Compare commits

...

75 Commits

Author SHA1 Message Date
642f7ec22f Update CHANGES.md with Kubernetes link 2018-03-30 23:12:38 -07:00
1cc043d1eb Update Kubernetes from v1.9.6 to v1.10.0 2018-03-30 22:14:07 -07:00
f8e9bfb1c0 Add disk_type variable for EBS volume type on AWS
* Change EBS volume type from `standard` ("prior generation")
 to `gp2`. Prometheus alerts are tuned for SSDs
* Other platforms have fast enough disks by default
2018-03-29 22:51:54 -07:00
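For illustration, a minimal sketch of the new variable in a cluster definition (the module name and other arguments are placeholders; only `disk_type` comes from this change):

```tf
module "aws-tempest" {
  # ...existing cluster arguments...

  # optional: EBS volume type for instances ("gp2" is the new default;
  # "standard" remains selectable for those who prefer it)
  disk_type = "gp2"
}
```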
b1e41dcb99 addons: Update from Grafana v4.6.3 to v5.0.4
This reverts commit c59a9c66b1.
2018-03-28 19:45:19 -07:00
de4d90750e Use consistent naming of remote provision steps 2018-03-26 00:29:57 -07:00
7acd4931f6 Remove redundant kubeconfig copy on AWS and GCP
* AWS and Google Cloud make use of auto-scaling groups
and managed instance groups, respectively. As such, the
kubeconfig is already held in cloud user-data
* Controller instances are provisioned with a kubeconfig
from user-data. It's redundant to use a Terraform remote
file copy step for the kubeconfig.
2018-03-26 00:01:47 -07:00
cfd603bea2 Ensure etcd secrets are only distributed to controller hosts
* Previously, etcd secrets were erroneously distributed to worker
nodes (permissions 500, ownership etcd:etcd).
2018-03-25 23:46:44 -07:00
fdb543e834 Add optional controller_type and worker_type vars on GCP
* Remove optional machine_type variable on Google Cloud
* Use controller_type and worker_type instead
2018-03-25 22:11:18 -07:00
8d3d4220fd Add disk_size variable on Google Cloud 2018-03-25 22:04:14 -07:00
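A hedged sketch combining the two commits above (the module name and values are placeholders; `controller_type`, `worker_type`, and `disk_size` are the variables they add):

```tf
module "google-cloud-yavin" {
  # ...existing cluster arguments...

  # optional: machine types per role, replacing the removed machine_type
  controller_type = "n1-standard-1"
  worker_type     = "n1-standard-1"

  # optional: instance disk size in GB
  disk_size = 40
}
```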
ba9daf439e Remove unmaintained pxe-worker internal module 2018-03-25 21:57:52 -07:00
38adb14bd2 Remove optional variable networking on Digital Ocean
* Calico isn't viable on Digital Ocean because their firewalls
do not support the IP-IP protocol. It's not viable to run a cluster
without firewalls just to use Calico.
* Remove the caveat note. Don't allow users to shoot themselves
in the foot
2018-03-25 21:48:51 -07:00
e43cf9f608 Organize and cleanup variable descriptions 2018-03-25 21:44:43 -07:00
455a4af27e Improve cluster definition examples in docs 2018-03-25 20:41:52 -07:00
39876e455f Fix docs to reflect enforced provider versions 2018-03-25 11:34:39 -07:00
da2be86e8c Add v1.9.6 heading to CHANGES.md 2018-03-22 22:01:29 -07:00
65a2751f77 addons: Update heapster from v1.5.1 to v1.5.2
* https://github.com/kubernetes/heapster/releases/tag/v1.5.2
2018-03-21 20:32:01 -07:00
a04ef3919a Update Kubernetes from v1.9.5 to v1.9.6 2018-03-21 20:29:52 -07:00
851bc1a3f8 Update nginx-ingress from 0.11.0 to 0.12.0 2018-03-19 23:17:17 -07:00
758c09fa5c Update Kubernetes from v1.9.4 to v1.9.5 2018-03-19 00:25:44 -07:00
b1cdd361ef Mention controllers node label in changelog 2018-03-19 00:15:56 -07:00
7f7bc960a6 Set default Google Cloud os_image to coreos-stable 2018-03-19 00:08:26 -07:00
29108fd99d Improve changelog with migration links 2018-03-18 23:54:55 -07:00
18d08de898 Add Container Linux Config snippet docs 2018-03-18 23:22:40 -07:00
f3730b2bfa Add Container Linux Config snippets feature
* Introduce the ability to support Container Linux Config
"snippets" for controllers and workers on cloud platforms.
This allows end-users to customize hosts by providing Container
Linux configs that are additively merged into the base configs
defined by Typhoon. Config snippets are validated, merged, and
show any errors during `terraform plan`
* Example uses include adding systemd units, network configs,
mounts, files, raid arrays, or other disk provisioning features
provided by Container Linux Configs (using Ignition low-level)
* Requires terraform-provider-ct v0.2.1 plugin
2018-03-18 18:28:18 -07:00
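As a sketch of the feature under stated assumptions (the `controller_clc_snippets` and `worker_clc_snippets` variable names and file paths here are illustrative), snippets are Container Linux Config YAML strings passed as lists:

```tf
module "google-cloud-yavin" {
  # ...existing cluster arguments...

  # assumed variable names: lists of Container Linux Config YAML strings,
  # validated and additively merged into the base configs at `terraform plan`
  controller_clc_snippets = ["${file("./clc/controller-units.yaml")}"]
  worker_clc_snippets     = ["${file("./clc/worker-units.yaml")}"]
}
```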
88aa9a46e5 Add /var/lib/calico volume mount to Calico DaemonSet 2018-03-18 16:40:38 -07:00
efa90d8b44 Add a new key=value label to controller nodes
* Add a node-role.kubernetes.io/controller="true" node label
to controllers so Prometheus service discovery can filter to
services that only run on controllers (i.e. masters)
* Leave node-role.kubernetes.io/master="" untouched as it's
a Kubernetes convention
2018-03-18 16:39:10 -07:00
46226a8015 Update Prometheus from 2.2.0 to 2.2.1 2018-03-18 15:56:44 -07:00
270d1ce357 Add links to upstream regressions 2018-03-14 18:56:20 -07:00
ab87b6cea3 Add clarifying links to CHANGES 2018-03-12 21:19:15 -07:00
d621512dd6 Promote AWS platform from beta to stable 2018-03-12 21:15:53 -07:00
c59a9c66b1 Revert "addons: Update from Grafana v4.6.3 to v5.0.0"
* Revert commit 9dcc255f8e.
* Grafana v5.0 is not compatible with Kubernetes v1.9.4. See
https://github.com/poseidon/typhoon/pull/162
2018-03-12 21:01:14 -07:00
21f2cef12f Improve changelog, README, and index page 2018-03-12 20:58:02 -07:00
931e311786 Update Kubernetes from v1.9.3 to v1.9.4 2018-03-12 18:07:50 -07:00
2592a0aad4 Allow Google accelerators (i.e. GPUs) on workers 2018-03-11 17:21:24 -07:00
6c5e287c29 Add details and links to the changelog 2018-03-11 17:07:07 -07:00
2a4595eeee Add links to the charitable donations list 2018-03-11 14:51:40 -07:00
8e7e6b9f7f Normalize Terraform configs with terraform fmt 2018-03-11 14:46:05 -07:00
35f3b1b28c Enable AWS NLB cross-zone load balancing
* https://github.com/terraform-providers/terraform-provider-aws/pull/3537
* https://aws.amazon.com/about-aws/whats-new/2018/02/network-load-balancer-now-supports-cross-zone-load-balancing/
2018-03-10 23:25:18 -08:00
9fb1e1a0e2 Update etcd from v3.3.1 to v3.3.2
* https://github.com/coreos/etcd/releases/tag/v3.3.2
2018-03-10 13:44:35 -08:00
b61d6373c5 Add ignore_changes for AWS worker image_id 2018-03-10 13:16:05 -08:00
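A sketch of the Terraform pattern this refers to (resource details elided and names illustrative); `ignore_changes` tells Terraform not to diff the AMI on existing workers, so new Container Linux releases don't force instance replacement:

```tf
resource "aws_launch_configuration" "worker" {
  # illustrative AMI lookup; other launch configuration arguments elided
  image_id = "${data.aws_ami.coreos.image_id}"

  lifecycle {
    ignore_changes = ["image_id"]
  }
}
```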
42708f9a70 Update Prometheus from v2.2.0-rc.1 to v2.2.0
* https://github.com/prometheus/prometheus/releases/tag/v2.2.0
2018-03-09 00:20:40 -08:00
d54709f89c Update Grafana from v5.0.0 to 5.0.1
* https://github.com/grafana/grafana/releases/tag/v5.0.1
2018-03-09 00:20:40 -08:00
0e688ef05a Update CHANGES.md changelog with monitoring updates 2018-03-09 00:20:40 -08:00
9dcc255f8e addons: Update from Grafana v4.6.3 to v5.0.0 2018-03-09 00:20:40 -08:00
9307e97c46 addons: Update Prometheus from v2.1.0 to v2.2.0
* Annotate Prometheus service to scrape metrics from
Prometheus itself (enables Prometheus* alerts)
* Update kube-state-metrics addon-resizer to 1.7
* Use port 8080 for kube-state-metrics
* Add PrometheusNotIngestingSamples alert rule
* Change K8SKubeletDown alert rule to fire when 10%
of kubelets are down, not 1%
  * https://github.com/coreos/prometheus-operator/pull/1032
2018-03-09 00:20:40 -08:00
c112ee3829 Rename cluster_name to name in internal module
* Ensure consistency between AWS and GCP platforms
2018-03-03 17:52:01 -08:00
45b556c08f Fix overly strict firewall for GCP "worker pools"
* Fix issue where worker firewall rules didn't apply to
additional workers attached to a GCP cluster using the new
"worker pools" feature (unreleased, #148). Solves host
connection timeouts and pods not being scheduled to attached
worker pools.
* Add `name` field to GCP internal worker module to represent
the unique name of the worker pool
* Use `cluster_name` field of GCP internal worker module for
passing the name of the cluster to which workers should be
attached
2018-03-03 17:40:17 -08:00
da6aafe816 Revert "Add module version requirements to internal workers modules"
* This reverts commit cce4537487.
* Provider passing to child modules is complex and the behavior
changed between Terraform v0.10 and v0.11. We're continuing to
allow both versions, so this change should be reverted. For the
time being, those using our internal Terraform modules will have
to be aware of the minimum versions of the AWS and GCP providers;
there is no good way to enforce them.
2018-03-03 16:56:34 -08:00
cce4537487 Add module version requirements to internal workers modules 2018-03-03 14:39:25 -08:00
73126eb7f8 Add support for worker pools on AWS
* Allow groups of workers to be defined and joined to
a cluster (i.e. worker pools)
* Move worker resources into a Terraform submodule
* Output variables needed for passing to worker pools
* Add usage docs for AWS worker pools (advanced)
2018-02-27 18:31:42 -08:00
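A rough sketch of the resulting usage (the module path matches this repo; the pool's variable and output names are illustrative, not authoritative). The google-cloud workers module in the next commit follows the same pattern:

```tf
module "tempest-worker-pool" {
  source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes/workers"

  providers = {
    aws = "aws.default"
  }

  # illustrative wiring: feed the cluster module's outputs into the pool
  vpc_id          = "${module.aws-tempest.vpc_id}"
  subnet_ids      = "${module.aws-tempest.subnet_ids}"
  security_groups = "${module.aws-tempest.worker_security_groups}"
  kubeconfig      = "${module.aws-tempest.kubeconfig}"

  name               = "tempest-pool"
  count              = 2
  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
}
```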
160ae34e71 Add support for worker pools on google-cloud
* Set defaults for internal worker module's count,
machine_type, and os_image
* Allow "pools" of homogeneous workers to be created
using the google-cloud/kubernetes/workers module
2018-02-26 22:36:36 -08:00
06d40c5b44 Show os_image coreos-stable on Google Cloud
* Don't need to define a specific dated image. Managed
instance groups do not delete instances when new images
are released to a channel
2018-02-26 22:24:44 -08:00
98985e5acd Remove unused etcd_service_ip template variable
* etcd_service_ip dates back to deprecated self-hosted etcd
2018-02-26 22:20:20 -08:00
ea6bf9c9fb Improve links in tutorials and changelog notes 2018-02-26 12:55:32 -08:00
486fdb6968 Simplify CLC kubeconfig templating on AWS and GCP
* Template terraform-render-bootkube's multi-line kubeconfig
output using the right indentation
* Add `kubeconfig` variable to google-cloud controllers and
workers Terraform submodules
* Remove `kubeconfig_*` variables from google-cloud controllers
and workers Terraform submodules
2018-02-26 12:49:01 -08:00
a44cf0edbd Update Calico from v3.0.2 to v3.0.3
* https://github.com/projectcalico/calico/releases/tag/v3.0.3
2018-02-26 12:48:19 -08:00
983c7aa012 Recommend installing terraform-provider-ct v0.2.1
* Upcoming releases may begin to use features that require
the `terraform-provider-ct` plugin v0.2.1
* New users should use `terraform-provider-ct` v0.2.1. Existing
users can safely drop-in replace their v0.2.0 plugin with v0.2.1
as well (location referenced in ~/.terraformrc).
* See https://github.com/poseidon/typhoon/pull/145
2018-02-25 19:39:54 -08:00
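For reference, a minimal sketch of the `~/.terraformrc` stanza mentioned above (the binary path is a placeholder):

```tf
# ~/.terraformrc
providers {
  ct = "/usr/local/bin/terraform-provider-ct"
}
```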
3d9683b6e8 Update the Digital Ocean SSH fingerprint docs 2018-02-25 19:09:38 -08:00
0da7757ef4 Pass Digital Ocean ssh_fingerprints as a list
* Fix digital-ocean module to pass ssh_fingerprints
as a list since the module accepts a list
2018-02-25 19:03:33 -08:00
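Illustrative usage after the fix (the module name and fingerprint are placeholders):

```tf
module "digital-ocean-nemo" {
  # ...existing cluster arguments...

  # pass fingerprints as a list, matching the module's declared type
  ssh_fingerprints = ["d7:9d:79:66:..."]
}
```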
04c6613ff3 Mention the command that applies the changes 2018-02-25 17:15:42 -08:00
92600efd11 Remove author employment disclosure note
* Author no longer works for CoreOS / Red Hat
* Typhoon development continues as usual
2018-02-24 18:30:51 -08:00
66c64b4e45 List addons below platforms in CHANGES 2018-02-22 22:33:13 -08:00
13f3745093 Add kubelet --volume-plugin-dir flag
* Set Kubelet search path for flexvolume plugins
to /var/lib/kubelet/volumeplugins
* Add support for flexvolume plugins on AWS, GCE, and DO
* See 9548572d98 which added flexvolume support for bare-metal
2018-02-22 22:11:45 -08:00
c4914c326b Update bootkube and terraform-render-bootkube to v0.11.0 2018-02-22 21:53:26 -08:00
461fd46986 Update CHANGES.md with AWS ELB to NLB change 2018-02-22 21:36:35 -08:00
ceb5555222 Switch apiserver from ELB to a network load balancer 2018-02-22 16:10:31 -08:00
86420fd507 Rename namespace manifests to be applied first
* Ensure kubectl apply -R creates manifests in the right order
2018-02-22 01:04:30 -08:00
5c383f4184 addons: Update nginx-ingress from 0.10.2 to 0.11.0 2018-02-21 23:54:12 -08:00
22fa051002 Switch Ingress ELB to a network load balancer
* Require terraform-provider-aws 1.7 or higher
2018-02-20 17:34:38 -08:00
c8313751d7 Ignore lifecycle changes to the AWS controller ami 2018-02-15 19:48:39 -08:00
195d902ab6 Upgrade etcd from v3.2.15 to v3.3.1 2018-02-15 19:29:46 -08:00
c19a68b59b Update bootkube control-plane manifests
* Remove PersistentVolumeLabel admission controller flag
* Switch Deployments and DaemonSets to apps/v1
* Minor update to pod-checkpointer image version
2018-02-15 11:06:35 -08:00
de88fa5457 addons: Update Heapster from v1.5.0 to v1.5.1
* Switch to k8s.gcr.io vanity image name
* Add service account, Role, and ClusterRole for heapster
2018-02-15 10:57:47 -08:00
d9a0183f3f addons/nginx-ingress: Fix typo in GCP selector name 2018-02-14 03:07:36 -05:00
7e24c67608 Remove docs mention of the etcd-network-checkpointer
* etcd-network-checkpointer is no longer used; it's a holdover
from the self-hosted etcd era
2018-02-13 16:19:03 -08:00
87 changed files with 1988 additions and 1574 deletions


@ -4,6 +4,119 @@ Notable changes between versions.
## Latest
* Kubernetes [v1.10.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1100)
* Remove unused, unmaintained `pxe-worker` internal module
#### AWS
* Add `disk_type` optional variable for setting the EBS volume type ([#176](https://github.com/poseidon/typhoon/pull/176))
* Change default type from `standard` to `gp2`. Prometheus etcd alerts are tuned for fast disks.
#### Digital Ocean
* Ensure etcd secrets are only distributed to controller hosts, not workers.
* Remove `networking` optional variable. Only flannel works on Digital Ocean.
#### Google Cloud
* Add `disk_size` optional variable for setting instance disk size in GB
* Add `controller_type` optional variable for setting machine type for controllers
* Add `worker_type` optional variable for setting machine type for workers
* Remove `machine_type` optional variable. Use `controller_type` and `worker_type`.
#### Addons
* Update Grafana from v4.6.3 to v5.0.4 ([#153](https://github.com/poseidon/typhoon/pull/153), [#174](https://github.com/poseidon/typhoon/pull/174))
* Restrict dashboard organization role to Viewer
## v1.9.6
* Kubernetes [v1.9.6](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v196)
* Update Calico from v3.0.3 to v3.0.4
#### Addons
* Update heapster from v1.5.1 to v1.5.2
## v1.9.5
* Kubernetes [v1.9.5](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v195)
* Fix `subPath` volume mounts regression ([kubernetes#61076](https://github.com/kubernetes/kubernetes/issues/61076))
* Introduce [Container Linux Config snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) on cloud platforms ([#145](https://github.com/poseidon/typhoon/pull/145))
* Validate and additively merge custom Container Linux Configs during `terraform plan`
* Define files, systemd units, dropins, networkd configs, mounts, users, and more
* Require updating `terraform-provider-ct` plugin from v0.2.0 to v0.2.1
* Add `node-role.kubernetes.io/controller="true"` node label to controllers ([#160](https://github.com/poseidon/typhoon/pull/160))
#### AWS
* [Require](https://typhoon.psdn.io/topics/maintenance/#terraform-provider-ct-v021) updating `terraform-provider-ct` plugin from v0.2.0 to [v0.2.1](https://github.com/coreos/terraform-provider-ct/releases/tag/v0.2.1) (action required!)
#### Digital Ocean
* [Require](https://typhoon.psdn.io/topics/maintenance/#terraform-provider-ct-v021) updating `terraform-provider-ct` plugin from v0.2.0 to [v0.2.1](https://github.com/coreos/terraform-provider-ct/releases/tag/v0.2.1) (action required!)
#### Google Cloud
* [Require](https://typhoon.psdn.io/topics/maintenance/#terraform-provider-ct-v021) updating `terraform-provider-ct` plugin from v0.2.0 to [v0.2.1](https://github.com/coreos/terraform-provider-ct/releases/tag/v0.2.1) (action required!)
* Relax `os_image` to optional. Default to "coreos-stable".
#### Addons
* Update nginx-ingress from 0.11.0 to 0.12.0
* Update Prometheus from 2.2.0 to 2.2.1
## v1.9.4
* Kubernetes [v1.9.4](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v194)
* Secret, configMap, downward API, and projected volumes now read-only (breaking, [kubernetes#58720](https://github.com/kubernetes/kubernetes/pull/58720))
* Regressed `subPath` volume mounts (regression, [kubernetes#61076](https://github.com/kubernetes/kubernetes/issues/61076))
* Mitigated `subPath` [CVE-2017-1002101](https://github.com/kubernetes/kubernetes/issues/60813)
* Introduce [worker pools](https://typhoon.psdn.io/advanced/worker-pools/) for AWS and Google Cloud for joining heterogeneous workers to existing clusters.
* Use new Network Load Balancers and cross zone load balancing on AWS
* Allow flexvolume plugins to be used on any Typhoon cluster (not just bare-metal)
* Upgrade etcd from v3.2.15 to v3.3.2
* Update Calico from v3.0.2 to v3.0.3
* Use kubernetes-incubator/bootkube v0.11.0
* [Recommend](https://typhoon.psdn.io/topics/maintenance/#terraform-provider-ct-v021) updating `terraform-provider-ct` plugin from v0.2.0 to [v0.2.1](https://github.com/coreos/terraform-provider-ct/releases/tag/v0.2.1) (action recommended)
#### AWS
* Promote AWS platform to stable
* Allow groups of workers to be defined and joined to a cluster (i.e. worker pools) ([#150](https://github.com/poseidon/typhoon/pull/150))
* Replace the apiserver elastic load balancer with a network load balancer ([#136](https://github.com/poseidon/typhoon/pull/136))
* Replace the Ingress elastic load balancer with a network load balancer ([#141](https://github.com/poseidon/typhoon/pull/141))
* AWS [NLBs](https://aws.amazon.com/blogs/aws/new-network-load-balancer-effortless-scaling-to-millions-of-requests-per-second/) can handle millions of RPS with high throughput and low latency.
* Require `terraform-provider-aws` 1.7.0 or higher
* Enable NLB [cross-zone](https://aws.amazon.com/about-aws/whats-new/2018/02/network-load-balancer-now-supports-cross-zone-load-balancing/) load balancing ([#159](https://github.com/poseidon/typhoon/pull/159))
* Requests are automatically evenly distributed to targets regardless of AZ
* Require `terraform-provider-aws` 1.11.0 or higher
* Add kubelet `--volume-plugin-dir` flag to allow flexvolume plugins ([#142](https://github.com/poseidon/typhoon/pull/142))
* Fix controller and worker launch configs to ignore AMI changes ([#126](https://github.com/poseidon/typhoon/pull/126), [#158](https://github.com/poseidon/typhoon/pull/158))
#### Digital Ocean
* Add kubelet `--volume-plugin-dir` flag to allow flexvolume plugins ([#142](https://github.com/poseidon/typhoon/pull/142))
* Fix to pass `ssh_fingerprints` as a list to droplets ([#143](https://github.com/poseidon/typhoon/pull/143))
#### Google Cloud
* Allow groups of workers to be defined and joined to a cluster (i.e. worker pools) ([#148](https://github.com/poseidon/typhoon/pull/148))
* Add kubelet `--volume-plugin-dir` flag to allow flexvolume plugins ([#142](https://github.com/poseidon/typhoon/pull/142))
* Add `kubeconfig` variable to `controllers` and `workers` submodules ([#147](https://github.com/poseidon/typhoon/pull/147))
* Remove `kubeconfig_*` variables from `controllers` and `workers` submodules ([#147](https://github.com/poseidon/typhoon/pull/147))
* Allow initial experimentation with accelerators (i.e. GPUs) on workers ([#161](https://github.com/poseidon/typhoon/pull/161)) (unofficial)
* Require `terraform-provider-google` v1.6.0
#### Addons
* Update Prometheus from 2.1.0 to 2.2.0 ([#153](https://github.com/poseidon/typhoon/pull/153))
* Scrape Prometheus itself to enable the Prometheus* alert rules
* Adjust KubeletDown rule to fire when 10% of kubelets are down
* Update heapster from v1.5.0 to v1.5.1 ([#131](https://github.com/poseidon/typhoon/pull/131))
* Use separate service account
* Update nginx-ingress from 0.10.2 to 0.11.0
## v1.9.3
* Kubernetes [v1.9.3](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v193)
@ -14,6 +127,11 @@ Notable changes between versions.
* Use separate service account for flannel
* Update etcd from v3.2.14 to v3.2.15
#### Digital Ocean
* Use new Droplet [types](https://developers.digitalocean.com/documentation/changelog/api-v2/new-size-slugs-for-droplet-plan-changes/) which offer more CPU/memory, at lower cost. ([#105](https://github.com/poseidon/typhoon/pull/105))
* A small Digital Ocean cluster costs less than $25 a month!
#### Addons
* Update Prometheus from v2.0.0 to v2.1.0 ([#113](https://github.com/poseidon/typhoon/pull/113))
@ -28,31 +146,19 @@ Notable changes between versions.
* Switch manifests to use `apps/v1` Deployments and Daemonsets ([#120](https://github.com/poseidon/typhoon/pull/120))
* Remove Kubernetes Dashboard manifests ([#121](https://github.com/poseidon/typhoon/pull/121))
#### Digital Ocean
* Use new Droplet [types](https://developers.digitalocean.com/documentation/changelog/api-v2/new-size-slugs-for-droplet-plan-changes/) which offer more CPU/memory, at lower cost. ([#105](https://github.com/poseidon/typhoon/pull/105))
* A small Digital Ocean cluster costs less than $25 a month!
## v1.9.2
* Kubernetes [v1.9.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v192)
* Add Terraform v0.11.x support
* Add explicit "providers" section to modules for Terraform v0.11.x
* Retain support for Terraform v0.10.4+
* Add [migration guide](https://github.com/poseidon/typhoon/blob/master/docs/topics/maintenance.md) from Terraform v0.10.x to v0.11.x (**action required!**)
* Add [migration guide](https://typhoon.psdn.io/topics/maintenance/#terraform-v011x) from Terraform v0.10.x to v0.11.x (**action required!**)
* Update etcd from 3.2.13 to 3.2.14
* Update calico from 2.6.5 to 2.6.6
* Update kube-dns from v1.14.7 to v1.14.8
* Use separate service account for kube-dns
* Use kubernetes-incubator/bootkube v0.10.0
#### Addons
* Update CLUO to v0.5.0 to fix compatibility with Kubernetes 1.9 (**important**)
* Earlier versions can't roll out Container Linux updates on Kubernetes 1.9 nodes ([cluo#163](https://github.com/coreos/container-linux-update-operator/issues/163))
* Update kube-state-metrics from v1.1.0 to v1.2.0
* Fix RBAC cluster role for kube-state-metrics
#### Bare-Metal
* Use per-node Container Linux install profiles ([#97](https://github.com/poseidon/typhoon/pull/97))
@ -64,6 +170,13 @@ Notable changes between versions.
* Relax `digitalocean` provider version constraint
* Fix bug with `terraform plan` always showing a firewall diff to be applied ([#3](https://github.com/poseidon/typhoon/issues/3))
#### Addons
* Update CLUO to v0.5.0 to fix compatibility with Kubernetes 1.9 (**important**)
* Earlier versions can't roll out Container Linux updates on Kubernetes 1.9 nodes ([cluo#163](https://github.com/coreos/container-linux-update-operator/issues/163))
* Update kube-state-metrics from v1.1.0 to v1.2.0
* Fix RBAC cluster role for kube-state-metrics
## v1.9.1
* Kubernetes [v1.9.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v191)


@ -11,10 +11,11 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.9.3 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.0 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Dashboards, Metrics, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/) and [preemption](https://typhoon.psdn.io/google-cloud/#preemption) (varies by platform)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
## Modules
@ -22,7 +23,7 @@ Typhoon provides a Terraform Module for each supported operating system and plat
| Platform | Operating System | Terraform Module | Status |
|---------------|------------------|------------------|--------|
| AWS | Container Linux | [aws/container-linux/kubernetes](aws/container-linux/kubernetes) | beta |
| AWS | Container Linux | [aws/container-linux/kubernetes](aws/container-linux/kubernetes) | stable |
| Bare-Metal | Container Linux | [bare-metal/container-linux/kubernetes](bare-metal/container-linux/kubernetes) | stable |
| Digital Ocean | Container Linux | [digital-ocean/container-linux/kubernetes](digital-ocean/container-linux/kubernetes) | beta |
| Google Cloud | Container Linux | [google-cloud/container-linux/kubernetes](google-cloud/container-linux/kubernetes) | beta |
@ -43,29 +44,28 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
```tf
module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.10.0"
providers = {
google = "google.default"
local = "local.default"
null = "null.default"
google = "google.default"
local = "local.default"
null = "null.default"
template = "template.default"
tls = "tls.default"
tls = "tls.default"
}
# Google Cloud
cluster_name = "yavin"
region = "us-central1"
dns_zone = "example.com"
dns_zone_name = "example-zone"
os_image = "coreos-stable-1576-5-0-v20180105"
cluster_name = "yavin"
controller_count = 1
worker_count = 2
# configuration
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
# output assets dir
asset_dir = "/home/user/.secrets/clusters/yavin"
asset_dir = "/home/user/.secrets/clusters/yavin"
# optional
worker_count = 2
}
```
@ -86,9 +86,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.9.3
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.3
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.3
yavin-controller-0.c.example-com.internal Ready 6m v1.10.0
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.0
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.0
```
List the pods.
@ -123,11 +123,11 @@ Typhoon is strict about minimalism, maturity, and scope. These are not in scope:
Ask questions on the IRC #typhoon channel on [freenode.net](http://freenode.net/).
## Background
## Motivation
Typhoon powers the author's cloud and colocation clusters. The project has evolved through operational experience and Kubernetes changes. Typhoon is shared under a free license to allow others to use the work freely and contribute to its upkeep.
Typhoon addresses real world needs, which you may share. It is honest about limitations or areas that aren't mature yet. It avoids buzzword bingo and hype. It does not aim to be the one-solution-fits-all distro. An ecosystem of free (or enterprise) Kubernetes distros is healthy.
Typhoon addresses real world needs, which you may share. It is honest about limitations or areas that aren't mature yet. It avoids buzzword bingo and hype. It does not aim to be the one-solution-fits-all distro. An ecosystem of Kubernetes distributions is healthy.
## Social Contract
@ -135,4 +135,6 @@ Typhoon is not a product, trial, or free-tier. It is not run by a company, does
Typhoon clusters will contain only [free](https://www.debian.org/intro/free) components. Cluster components will not collect data on users without their permission.
*Disclosure: The author works for Red Hat (prev CoreOS), but Typhoon is unassociated and maintained independently.*
## Donations
Typhoon does not accept money donations. Instead, we encourage you to donate to one of [these organizations](https://github.com/poseidon/typhoon/wiki/Donations) to show your appreciation.


@ -0,0 +1,15 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboard-providers
  namespace: monitoring
data:
  dashboard-providers.yaml: |+
    apiVersion: 1
    providers:
    - name: 'default'
      orgId: 1
      folder: ''
      type: file
      options:
        path: /var/lib/grafana/dashboards


@ -5,14 +5,12 @@ metadata:
namespace: monitoring
data:
deployment-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -39,7 +37,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -110,7 +108,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -181,7 +179,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "Bps",
"gauge": {
@ -262,7 +260,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -333,7 +331,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -403,7 +401,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -473,7 +471,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -550,7 +548,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -665,7 +663,7 @@ data:
{
"allValue": ".*",
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": false,
"label": "Namespace",
@ -685,7 +683,7 @@ data:
{
"allValue": null,
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": false,
"label": "Deployment",
@ -737,24 +735,11 @@ data:
"title": "Deployment",
"version": 1
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
etcd-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"name": "DS_PROMETHEUS",
"name": "prometheus",
"label": "prometheus",
"description": "",
"type": "datasource",
@ -813,7 +798,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"format": "none",
@ -889,7 +874,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 0,
@ -978,7 +963,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 0,
@ -1079,7 +1064,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"decimals": null,
"editable": false,
"error": false,
@ -1161,7 +1146,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 0,
@ -1250,7 +1235,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 0,
@ -1342,7 +1327,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 5,
@ -1422,7 +1407,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 5,
@ -1502,7 +1487,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 0,
@ -1582,7 +1567,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"decimals": null,
"editable": false,
"error": false,
@ -1676,7 +1661,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 0,
@ -1782,7 +1767,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"decimals": 0,
"editable": false,
"error": false,
@ -1909,26 +1894,13 @@ data:
"title": "etcd",
"version": 4
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
kubernetes-capacity-planning-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -1954,7 +1926,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2032,7 +2004,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2134,7 +2106,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2250,7 +2222,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -2333,7 +2305,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2440,7 +2412,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percentunit",
"gauge": {
@ -2522,7 +2494,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2604,7 +2576,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2695,7 +2667,7 @@ data:
"aliasColors": {},
"bars": false,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2782,7 +2754,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -2897,26 +2869,13 @@ data:
"title": "Kubernetes Capacity Planning",
"version": 4
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
kubernetes-cluster-health-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -2944,7 +2903,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3025,7 +2984,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3101,7 +3060,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3177,7 +3136,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3263,7 +3222,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3339,7 +3298,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3415,7 +3374,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3491,7 +3450,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3605,26 +3564,13 @@ data:
"title": "Kubernetes Cluster Health",
"version": 9
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
kubernetes-cluster-status-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -3651,7 +3597,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3723,7 +3669,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3805,7 +3751,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -3877,7 +3823,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -3949,7 +3895,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4021,7 +3967,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -4103,7 +4049,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4175,7 +4121,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4247,7 +4193,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4319,7 +4265,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4429,26 +4375,13 @@ data:
"title": "Kubernetes Cluster Status",
"version": 3
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
kubernetes-control-plane-status-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -4475,7 +4408,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4550,7 +4483,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4625,7 +4558,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4700,7 +4633,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4783,7 +4716,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -4869,7 +4802,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -4944,7 +4877,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -5069,26 +5002,13 @@ data:
"title": "Kubernetes Control Plane Status",
"version": 3
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
kubernetes-resource-requests-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -5113,7 +5033,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"description": "This represents the total [CPU resource requests](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-cpu) in the cluster.\nFor comparison the total [allocatable CPU cores](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node-allocatable.md) is also shown.",
"editable": false,
"error": false,
@ -5202,7 +5122,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -5284,7 +5204,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"description": "This represents the total [memory resource requests](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-memory) in the cluster.\nFor comparison the total [allocatable memory](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node-allocatable.md) is also shown.",
"editable": false,
"error": false,
@ -5373,7 +5293,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -5486,26 +5406,13 @@ data:
"title": "Kubernetes Resource Requests",
"version": 2
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
nodes-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -5532,7 +5439,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -5611,7 +5518,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -5713,7 +5620,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -5825,7 +5732,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -5907,7 +5814,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -6014,7 +5921,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percentunit",
"gauge": {
@ -6096,7 +6003,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -6178,7 +6085,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -6270,7 +6177,7 @@ data:
{
"allValue": null,
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": false,
"label": null,
@ -6322,26 +6229,13 @@ data:
"title": "Nodes",
"version": 2
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
pods-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -6366,7 +6260,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -6472,7 +6366,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -6576,7 +6470,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -6662,7 +6556,7 @@ data:
{
"allValue": ".*",
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": true,
"label": "Namespace",
@ -6682,7 +6576,7 @@ data:
{
"allValue": null,
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": false,
"label": "Pod",
@ -6702,7 +6596,7 @@ data:
{
"allValue": ".*",
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": true,
"label": "Container",
@ -6754,26 +6648,13 @@ data:
"title": "Pods",
"version": 1
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
statefulset-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -6800,7 +6681,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -6871,7 +6752,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -6942,7 +6823,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "Bps",
"gauge": {
@ -7023,7 +6904,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -7094,7 +6975,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -7164,7 +7045,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -7234,7 +7115,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -7311,7 +7192,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -7405,7 +7286,7 @@ data:
{
"allValue": ".*",
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": false,
"label": "Namespace",
@ -7425,7 +7306,7 @@ data:
{
"allValue": null,
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": false,
"label": "StatefulSet",
@ -7477,23 +7358,4 @@ data:
"title": "StatefulSet",
"version": 1
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
prometheus-datasource.json: |+
{
"access": "proxy",
"basicAuth": false,
"name": "prometheus",
"type": "prometheus",
"url": "http://prometheus.monitoring.svc"
}
---


@ -0,0 +1,16 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasources
  namespace: monitoring
data:
  prometheus.yaml: |+
    apiVersion: 1
    datasources:
    - name: prometheus
      type: prometheus
      access: proxy
      orgId: 1
      url: http://prometheus.monitoring.svc.cluster.local
      version: 1
      editable: false


@ -21,7 +21,7 @@ spec:
spec:
containers:
- name: grafana
image: grafana/grafana:4.6.3
image: grafana/grafana:5.0.4
env:
- name: GF_SERVER_HTTP_PORT
value: "8080"
@ -30,7 +30,7 @@ spec:
- name: GF_AUTH_ANONYMOUS_ENABLED
value: "true"
- name: GF_AUTH_ANONYMOUS_ORG_ROLE
value: Admin
value: Viewer
ports:
- name: http
containerPort: 8080
@ -41,22 +41,20 @@ spec:
limits:
memory: 200Mi
cpu: 200m
- name: grafana-watcher
image: quay.io/coreos/grafana-watcher:v0.0.8
args:
- '--watch-dir=/etc/grafana/dashboards'
- '--grafana-url=http://localhost:8080'
resources:
requests:
memory: "16Mi"
cpu: "50m"
limits:
memory: "32Mi"
cpu: "100m"
volumeMounts:
- name: dashboards
mountPath: /etc/grafana/dashboards
- name: datasources
mountPath: /etc/grafana/provisioning/datasources
- name: dashboard-providers
mountPath: /etc/grafana/provisioning/dashboards
- name: dashboards
mountPath: /var/lib/grafana/dashboards
volumes:
- name: datasources
configMap:
name: grafana-datasources
- name: dashboard-providers
configMap:
name: grafana-dashboard-providers
- name: dashboards
configMap:
name: grafana-dashboards


@ -0,0 +1,12 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: heapster
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:heapster
subjects:
- kind: ServiceAccount
  name: heapster
  namespace: kube-system


@ -14,12 +14,11 @@ spec:
labels:
name: heapster
phase: prod
annotations:
scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
serviceAccountName: heapster
containers:
- name: heapster
image: gcr.io/google_containers/heapster-amd64:v1.5.0
image: k8s.gcr.io/heapster-amd64:v1.5.2
command:
- /heapster
- --source=kubernetes.summary_api:''
@ -31,7 +30,7 @@ spec:
initialDelaySeconds: 180
timeoutSeconds: 5
- name: heapster-nanny
image: gcr.io/google_containers/addon-resizer:1.7
image: k8s.gcr.io/addon-resizer:1.7
command:
- /pod_nanny
- --cpu=80m


@ -0,0 +1,13 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: heapster
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: system:pod-nanny
subjects:
- kind: ServiceAccount
  name: heapster
  namespace: kube-system

addons/heapster/role.yaml (new file)

@ -0,0 +1,19 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: system:pod-nanny
  namespace: kube-system
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - "extensions"
  resources:
  - deployments
  verbs:
  - get
  - update


@ -0,0 +1,5 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  name: heapster
  namespace: kube-system


@ -23,7 +23,7 @@ spec:
hostNetwork: true
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.10.2
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.12.0
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-backend


@ -23,7 +23,7 @@ spec:
hostNetwork: true
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.10.2
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.12.0
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-backend


@ -10,7 +10,7 @@ spec:
maxUnavailable: 1
selector:
matchLabels:
name: nginx-ingess-controller
name: nginx-ingress-controller
phase: prod
template:
metadata:
@ -23,7 +23,7 @@ spec:
hostNetwork: true
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.10.2
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.12.0
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-backend


@ -18,7 +18,7 @@ spec:
serviceAccountName: prometheus
containers:
- name: prometheus
image: quay.io/prometheus/prometheus:v2.1.0
image: quay.io/prometheus/prometheus:v2.2.1
args:
- '--config.file=/etc/prometheus/prometheus.yaml'
ports:


@ -33,7 +33,7 @@ spec:
initialDelaySeconds: 5
timeoutSeconds: 5
- name: addon-resizer
image: gcr.io/google_containers/addon-resizer:1.0
image: gcr.io/google_containers/addon-resizer:1.7
resources:
limits:
cpu: 100m
@ -54,8 +54,8 @@ spec:
- /pod_nanny
- --container=kube-state-metrics
- --cpu=100m
- --extra-cpu=2m
- --memory=150Mi
- --extra-memory=30Mi
- --extra-cpu=1m
- --memory=100Mi
- --extra-memory=2Mi
- --threshold=5
- --deployment=kube-state-metrics


@ -15,5 +15,5 @@ spec:
ports:
- name: metrics
protocol: TCP
port: 80
port: 8080
targetPort: 8080


@ -353,7 +353,7 @@ data:
description: Prometheus failed to scrape {{ $value }}% of kubelets.
- alert: K8SKubeletDown
expr: (absent(up{job="kubelet"} == 1) or count(up{job="kubelet"} == 0) / count(up{job="kubelet"}))
* 100 > 1
* 100 > 10
for: 1h
labels:
severity: critical
@ -588,3 +588,11 @@ data:
description: '{{$labels.job}} at {{$labels.instance}} has a corrupted write-ahead
log (WAL).'
summary: Prometheus write-ahead log is corrupted
- alert: PrometheusNotIngestingSamples
  expr: rate(prometheus_tsdb_head_samples_appended_total[5m]) <= 0
  for: 10m
  labels:
    severity: warning
  annotations:
    description: "Prometheus {{ $labels.namespace }}/{{ $labels.pod}} isn't ingesting samples."
    summary: "Prometheus isn't ingesting samples"


@ -3,6 +3,8 @@ kind: Service
metadata:
name: prometheus
namespace: monitoring
annotations:
prometheus.io/scrape: 'true'
spec:
type: ClusterIP
selector:


@ -11,10 +11,11 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.9.3 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.0 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Dashboards, Metrics, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
## Docs


@ -0,0 +1,69 @@
# kube-apiserver Network Load Balancer DNS Record
resource "aws_route53_record" "apiserver" {
zone_id = "${var.dns_zone_id}"
name = "${format("%s.%s.", var.cluster_name, var.dns_zone)}"
type = "A"
# AWS recommends their special "alias" records for ELBs
alias {
name = "${aws_lb.apiserver.dns_name}"
zone_id = "${aws_lb.apiserver.zone_id}"
evaluate_target_health = true
}
}
# Network Load Balancer for apiservers
resource "aws_lb" "apiserver" {
name = "${var.cluster_name}-apiserver"
load_balancer_type = "network"
internal = false
subnets = ["${aws_subnet.public.*.id}"]
enable_cross_zone_load_balancing = true
}
# Forward TCP apiserver traffic to controllers
resource "aws_lb_listener" "apiserver-https" {
load_balancer_arn = "${aws_lb.apiserver.arn}"
protocol = "TCP"
port = "443"
default_action {
type = "forward"
target_group_arn = "${aws_lb_target_group.controllers.arn}"
}
}
# Target group of controllers
resource "aws_lb_target_group" "controllers" {
name = "${var.cluster_name}-controllers"
vpc_id = "${aws_vpc.network.id}"
target_type = "instance"
protocol = "TCP"
port = 443
# TCP health check against the kube-apiserver port
health_check {
protocol = "TCP"
port = 443
# NLBs require the healthy and unhealthy thresholds to match
healthy_threshold = 3
unhealthy_threshold = 3
# Health check interval must be 10 or 30 seconds
interval = 10
}
}
# Attach controller instances to apiserver NLB
resource "aws_lb_target_group_attachment" "controllers" {
count = "${var.controller_count}"
target_group_arn = "${aws_lb_target_group.controllers.arn}"
target_id = "${element(aws_instance.controllers.*.id, count.index)}"
port = 443
}
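For orientation, the Route53 record above is the stable apiserver endpoint that kubeconfigs target. A hedged sketch (not part of this diff, output name is illustrative) of surfacing that endpoint from this module:

# sketch: the DNS name served by the alias record above, e.g. tempest.aws.example.com
output "kube_apiserver_dns" {
  value = "${format("%s.%s", var.cluster_name, var.dns_zone)}"
}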

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=203b90169ead2380f74cc64ea1f02c109806c9bc"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=5f3546b66ffb9946b36e612537bb6a1830ae7746"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]

View File

@ -7,7 +7,7 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.2.15"
Environment="ETCD_IMAGE_TAG=v3.3.2"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
@ -66,6 +66,7 @@ systemd:
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
@ -80,8 +81,10 @@ systemd:
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/master \
--node-labels=node-role.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
RestartSec=10
@ -107,29 +110,14 @@ storage:
mode: 0644
contents:
inline: |
apiVersion: v1
kind: Config
clusters:
- name: local
cluster:
server: ${kubeconfig_server}
certificate-authority-data: ${kubeconfig_ca_cert}
users:
- name: kubelet
user:
client-certificate-data: ${kubeconfig_kubelet_cert}
client-key-data: ${kubeconfig_kubelet_key}
contexts:
- context:
cluster: local
user: kubelet
${kubeconfig}
- path: /etc/kubernetes/kubelet.env
filesystem: root
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.3
KUBELET_IMAGE_TAG=v1.10.0
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -150,7 +138,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.10.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \

View File

@ -28,7 +28,7 @@ resource "aws_instance" "controllers" {
# storage
root_block_device {
volume_type = "standard"
volume_type = "${var.disk_type}"
volume_size = "${var.disk_size}"
}
@ -36,6 +36,10 @@ resource "aws_instance" "controllers" {
associate_public_ip_address = true
subnet_id = "${element(aws_subnet.public.*.id, count.index)}"
vpc_security_group_ids = ["${aws_security_group.controller.id}"]
lifecycle {
ignore_changes = ["ami"]
}
}
# Controller Container Linux Config
@ -52,13 +56,10 @@ data "template_file" "controller_config" {
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
etcd_initial_cluster = "${join(",", formatlist("%s=https://%s:2380", null_resource.repeat.*.triggers.name, null_resource.repeat.*.triggers.domain))}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
kubeconfig_ca_cert = "${module.bootkube.ca_cert}"
kubeconfig_kubelet_cert = "${module.bootkube.kubelet_cert}"
kubeconfig_kubelet_key = "${module.bootkube.kubelet_key}"
kubeconfig_server = "${module.bootkube.server}"
kubeconfig = "${indent(10, module.bootkube.kubeconfig)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}
}
@ -77,186 +78,5 @@ data "ct_config" "controller_ign" {
count = "${var.controller_count}"
content = "${element(data.template_file.controller_config.*.rendered, count.index)}"
pretty_print = false
snippets = ["${var.controller_clc_snippets}"]
}
# Security Group (instance firewall)
resource "aws_security_group" "controller" {
name = "${var.cluster_name}-controller"
description = "${var.cluster_name} controller security group"
vpc_id = "${aws_vpc.network.id}"
tags = "${map("Name", "${var.cluster_name}-controller")}"
}
resource "aws_security_group_rule" "controller-icmp" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "icmp"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-ssh" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 22
to_port = 22
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-apiserver" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 443
to_port = 443
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-etcd" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 2379
to_port = 2380
self = true
}
resource "aws_security_group_rule" "controller-flannel" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "udp"
from_port = 8472
to_port = 8472
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-flannel-self" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "udp"
from_port = 8472
to_port = 8472
self = true
}
resource "aws_security_group_rule" "controller-node-exporter" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 9100
to_port = 9100
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-kubelet-self" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 10250
to_port = 10250
self = true
}
resource "aws_security_group_rule" "controller-kubelet-read" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 10255
to_port = 10255
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-kubelet-read-self" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 10255
to_port = 10255
self = true
}
resource "aws_security_group_rule" "controller-bgp" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 179
to_port = 179
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-bgp-self" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 179
to_port = 179
self = true
}
resource "aws_security_group_rule" "controller-ipip" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = 4
from_port = 0
to_port = 0
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-ipip-self" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = 4
from_port = 0
to_port = 0
self = true
}
resource "aws_security_group_rule" "controller-ipip-legacy" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = 94
from_port = 0
to_port = 0
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-ipip-legacy-self" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = 94
from_port = 0
to_port = 0
self = true
}
resource "aws_security_group_rule" "controller-egress" {
security_group_id = "${aws_security_group.controller.id}"
type = "egress"
protocol = "-1"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
}

View File

@ -1,43 +0,0 @@
# kube-apiserver Network Load Balancer DNS Record
resource "aws_route53_record" "apiserver" {
zone_id = "${var.dns_zone_id}"
name = "${format("%s.%s.", var.cluster_name, var.dns_zone)}"
type = "A"
# AWS recommends their special "alias" records for ELBs
alias {
name = "${aws_elb.apiserver.dns_name}"
zone_id = "${aws_elb.apiserver.zone_id}"
evaluate_target_health = true
}
}
# Controller Network Load Balancer
resource "aws_elb" "apiserver" {
name = "${var.cluster_name}-apiserver"
subnets = ["${aws_subnet.public.*.id}"]
security_groups = ["${aws_security_group.controller.id}"]
listener {
lb_port = 443
lb_protocol = "tcp"
instance_port = 443
instance_protocol = "tcp"
}
instances = ["${aws_instance.controllers.*.id}"]
# SSL health check against the apiserver port
health_check {
target = "SSL:443"
healthy_threshold = 2
unhealthy_threshold = 4
timeout = 5
interval = 6
}
idle_timeout = 3600
connection_draining = true
connection_draining_timeout = 300
}

View File

@ -1,32 +0,0 @@
# Ingress Network Load Balancer
resource "aws_elb" "ingress" {
name = "${var.cluster_name}-ingress"
subnets = ["${aws_subnet.public.*.id}"]
security_groups = ["${aws_security_group.worker.id}"]
listener {
lb_port = 80
lb_protocol = "tcp"
instance_port = 80
instance_protocol = "tcp"
}
listener {
lb_port = 443
lb_protocol = "tcp"
instance_port = 443
instance_protocol = "tcp"
}
# Ingress Controller HTTP health check
health_check {
target = "HTTP:10254/healthz"
healthy_threshold = 2
unhealthy_threshold = 4
timeout = 5
interval = 6
}
connection_draining = true
connection_draining_timeout = 300
}

View File

@ -1,4 +1,25 @@
output "ingress_dns_name" {
value = "${aws_elb.ingress.dns_name}"
description = "DNS name of the ELB for distributing traffic to Ingress controllers"
value = "${module.workers.ingress_dns_name}"
description = "DNS name of the network load balancer for distributing traffic to Ingress controllers"
}
# Outputs for worker pools
output "vpc_id" {
value = "${aws_vpc.network.id}"
description = "ID of the VPC for creating worker instances"
}
output "subnet_ids" {
value = ["${aws_subnet.public.*.id}"]
description = "List of subnet IDs for creating worker instances"
}
output "worker_security_groups" {
value = ["${aws_security_group.worker.id}"]
description = "List of worker security group IDs"
}
output "kubeconfig" {
value = "${module.bootkube.kubeconfig}"
}
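These outputs exist so additional worker pools can be attached to a cluster. A hedged sketch of consuming them, assuming a cluster module named "tempest" (the source path and values are illustrative, not from this diff):

module "tempest-worker-pool" {
  source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes/workers"

  # outputs from the hypothetical cluster module
  name            = "tempest-pool"
  vpc_id          = "${module.tempest.vpc_id}"
  subnet_ids      = ["${module.tempest.subnet_ids}"]
  security_groups = ["${module.tempest.worker_security_groups}"]
  kubeconfig      = "${module.tempest.kubeconfig}"

  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
  count              = 2
  instance_type      = "t2.medium"
}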

View File

@ -5,7 +5,7 @@ terraform {
}
provider "aws" {
version = "~> 1.0"
version = "~> 1.11"
}
provider "local" {

View File

@ -0,0 +1,385 @@
# Security Groups (instance firewalls)
# Controller security group
resource "aws_security_group" "controller" {
name = "${var.cluster_name}-controller"
description = "${var.cluster_name} controller security group"
vpc_id = "${aws_vpc.network.id}"
tags = "${map("Name", "${var.cluster_name}-controller")}"
}
resource "aws_security_group_rule" "controller-icmp" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "icmp"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-ssh" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 22
to_port = 22
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-apiserver" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 443
to_port = 443
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-etcd" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 2379
to_port = 2380
self = true
}
resource "aws_security_group_rule" "controller-flannel" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "udp"
from_port = 8472
to_port = 8472
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-flannel-self" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "udp"
from_port = 8472
to_port = 8472
self = true
}
resource "aws_security_group_rule" "controller-node-exporter" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 9100
to_port = 9100
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-kubelet-self" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 10250
to_port = 10250
self = true
}
resource "aws_security_group_rule" "controller-kubelet-read" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 10255
to_port = 10255
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-kubelet-read-self" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 10255
to_port = 10255
self = true
}
resource "aws_security_group_rule" "controller-bgp" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 179
to_port = 179
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-bgp-self" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 179
to_port = 179
self = true
}
resource "aws_security_group_rule" "controller-ipip" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = 4
from_port = 0
to_port = 0
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-ipip-self" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = 4
from_port = 0
to_port = 0
self = true
}
resource "aws_security_group_rule" "controller-ipip-legacy" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = 94
from_port = 0
to_port = 0
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-ipip-legacy-self" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = 94
from_port = 0
to_port = 0
self = true
}
resource "aws_security_group_rule" "controller-egress" {
security_group_id = "${aws_security_group.controller.id}"
type = "egress"
protocol = "-1"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
}
# Worker security group
resource "aws_security_group" "worker" {
name = "${var.cluster_name}-worker"
description = "${var.cluster_name} worker security group"
vpc_id = "${aws_vpc.network.id}"
tags = "${map("Name", "${var.cluster_name}-worker")}"
}
resource "aws_security_group_rule" "worker-icmp" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "icmp"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "worker-ssh" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 22
to_port = 22
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "worker-http" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 80
to_port = 80
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "worker-https" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 443
to_port = 443
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "worker-flannel" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "udp"
from_port = 8472
to_port = 8472
source_security_group_id = "${aws_security_group.controller.id}"
}
resource "aws_security_group_rule" "worker-flannel-self" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "udp"
from_port = 8472
to_port = 8472
self = true
}
resource "aws_security_group_rule" "worker-node-exporter" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 9100
to_port = 9100
self = true
}
resource "aws_security_group_rule" "ingress-health" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 10254
to_port = 10254
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "worker-kubelet" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 10250
to_port = 10250
source_security_group_id = "${aws_security_group.controller.id}"
}
resource "aws_security_group_rule" "worker-kubelet-self" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 10250
to_port = 10250
self = true
}
resource "aws_security_group_rule" "worker-kubelet-read" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 10255
to_port = 10255
source_security_group_id = "${aws_security_group.controller.id}"
}
resource "aws_security_group_rule" "worker-kubelet-read-self" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 10255
to_port = 10255
self = true
}
resource "aws_security_group_rule" "worker-bgp" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 179
to_port = 179
source_security_group_id = "${aws_security_group.controller.id}"
}
resource "aws_security_group_rule" "worker-bgp-self" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 179
to_port = 179
self = true
}
resource "aws_security_group_rule" "worker-ipip" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = 4
from_port = 0
to_port = 0
source_security_group_id = "${aws_security_group.controller.id}"
}
resource "aws_security_group_rule" "worker-ipip-self" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = 4
from_port = 0
to_port = 0
self = true
}
resource "aws_security_group_rule" "worker-ipip-legacy" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = 94
from_port = 0
to_port = 0
source_security_group_id = "${aws_security_group.controller.id}"
}
resource "aws_security_group_rule" "worker-ipip-legacy-self" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = 94
from_port = 0
to_port = 0
self = true
}
resource "aws_security_group_rule" "worker-egress" {
security_group_id = "${aws_security_group.worker.id}"
type = "egress"
protocol = "-1"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
}

View File

@ -1,5 +1,5 @@
# Secure copy etcd TLS assets and kubeconfig to controllers. Activates kubelet.service
resource "null_resource" "copy-secrets" {
# Secure copy etcd TLS assets to controllers.
resource "null_resource" "copy-controller-secrets" {
count = "${var.controller_count}"
connection {
@ -9,11 +9,6 @@ resource "null_resource" "copy-secrets" {
timeout = "15m"
}
provisioner "file" {
content = "${module.bootkube.kubeconfig}"
destination = "$HOME/kubeconfig"
}
provisioner "file" {
content = "${module.bootkube.etcd_ca_cert}"
destination = "$HOME/etcd-client-ca.crt"
@ -61,7 +56,6 @@ resource "null_resource" "copy-secrets" {
"sudo mv etcd-peer.key /etc/ssl/etcd/etcd/peer.key",
"sudo chown -R etcd:etcd /etc/ssl/etcd",
"sudo chmod -R 500 /etc/ssl/etcd",
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}
@ -69,7 +63,12 @@ resource "null_resource" "copy-secrets" {
# Secure copy bootkube assets to ONE controller and start bootkube to perform
# one-time self-hosted cluster bootstrapping.
resource "null_resource" "bootkube-start" {
depends_on = ["module.bootkube", "null_resource.copy-secrets", "aws_route53_record.apiserver"]
depends_on = [
"module.bootkube",
"module.workers",
"aws_route53_record.apiserver",
"null_resource.copy-controller-secrets",
]
connection {
type = "ssh"
@ -85,7 +84,7 @@ resource "null_resource" "bootkube-start" {
provisioner "remote-exec" {
inline = [
"sudo mv /home/core/assets /opt/bootkube",
"sudo mv $HOME/assets /opt/bootkube",
"sudo systemctl start bootkube",
]
}

View File

@ -1,21 +1,44 @@
variable "cluster_name" {
type = "string"
description = "Cluster name"
description = "Unique cluster name (prepended to dns_zone)"
}
# AWS
variable "dns_zone" {
type = "string"
description = "AWS DNS Zone (e.g. aws.dghubble.io)"
description = "AWS Route53 DNS Zone (e.g. aws.example.com)"
}
variable "dns_zone_id" {
type = "string"
description = "AWS DNS Zone ID (e.g. Z3PAABBCFAKEC0)"
description = "AWS Route53 DNS Zone ID (e.g. Z3PAABBCFAKEC0)"
}
variable "ssh_authorized_key" {
# instances
variable "controller_count" {
type = "string"
description = "SSH public key for user 'core'"
default = "1"
description = "Number of controllers (i.e. masters)"
}
variable "worker_count" {
type = "string"
default = "1"
description = "Number of workers"
}
variable "controller_type" {
type = "string"
default = "t2.small"
description = "EC2 instance type for controllers"
}
variable "worker_type" {
type = "string"
default = "t2.small"
description = "EC2 instance type for workers"
}
variable "os_channel" {
@ -27,41 +50,34 @@ variable "os_channel" {
variable "disk_size" {
type = "string"
default = "40"
description = "The size of the disk in Gigabytes"
description = "Size of the EBS volume in GB"
}
variable "host_cidr" {
description = "CIDR IPv4 range to assign to EC2 nodes"
variable "disk_type" {
type = "string"
default = "10.0.0.0/16"
default = "gp2"
description = "Type of the EBS volume (e.g. standard, gp2, io1)"
}
variable "controller_count" {
variable "controller_clc_snippets" {
type = "list"
description = "Controller Container Linux Config snippets"
default = []
}
variable "worker_clc_snippets" {
type = "list"
description = "Worker Container Linux Config snippets"
default = []
}
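These two list variables let admins append Container Linux Config snippets to the controller and worker configs. A minimal, hedged sketch of passing a controller snippet (the module label, source path, values, and systemd unit are illustrative assumptions, not part of this diff):

module "demo" {
  source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes"

  cluster_name       = "demo"
  dns_zone           = "aws.example.com"
  dns_zone_id        = "Z3PAABBCFAKEC0"
  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
  asset_dir          = "/home/user/.secrets/clusters/demo"

  # hypothetical snippet: add an extra systemd unit to controllers
  controller_clc_snippets = [
    <<EOF
---
systemd:
  units:
    - name: hello.service
      enable: true
      contents: |
        [Unit]
        Description=Hello World
        [Service]
        ExecStart=/usr/bin/echo hello
        [Install]
        WantedBy=multi-user.target
EOF
  ]
}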
# configuration
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for user 'core'"
}
variable "controller_type" {
type = "string"
default = "t2.small"
description = "Controller EC2 instance type"
}
variable "worker_count" {
type = "string"
default = "1"
description = "Number of workers"
}
variable "worker_type" {
type = "string"
default = "t2.small"
description = "Worker EC2 instance type"
}
# bootkube assets
variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
type = "string"
@ -79,6 +95,12 @@ variable "network_mtu" {
default = "1480"
}
variable "host_cidr" {
description = "CIDR IPv4 range to assign to EC2 nodes"
type = "string"
default = "10.0.0.0/16"
}
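Putting the reorganized variables together, a hedged example of a cluster definition (all values hypothetical, source path illustrative):

module "tempest" {
  source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes"

  cluster_name = "tempest"

  # AWS
  dns_zone    = "aws.example.com"
  dns_zone_id = "Z3PAABBCFAKEC0"

  # instances
  controller_count = 1
  worker_count     = 2
  controller_type  = "t2.small"
  worker_type      = "t2.small"
  disk_type        = "gp2"

  # configuration
  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
  asset_dir          = "/home/user/.secrets/clusters/tempest"
}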
variable "pod_cidr" {
description = "CIDR IPv4 range to assign Kubernetes pods"
type = "string"

View File

@ -1,275 +1,20 @@
module "workers" {
source = "workers"
name = "${var.cluster_name}"
# AWS
vpc_id = "${aws_vpc.network.id}"
subnet_ids = ["${aws_subnet.public.*.id}"]
security_groups = ["${aws_security_group.worker.id}"]
count = "${var.worker_count}"
instance_type = "${var.worker_type}"
os_channel = "${var.os_channel}"
disk_size = "${var.disk_size}"
# configuration
kubeconfig = "${module.bootkube.kubeconfig}"
ssh_authorized_key = "${var.ssh_authorized_key}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
clc_snippets = "${var.worker_clc_snippets}"
}
# Workers AutoScaling Group
resource "aws_autoscaling_group" "workers" {
name = "${var.cluster_name}-worker ${aws_launch_configuration.worker.name}"
load_balancers = ["${aws_elb.ingress.id}"]
# count
desired_capacity = "${var.worker_count}"
min_size = "${var.worker_count}"
max_size = "${var.worker_count + 2}"
default_cooldown = 30
health_check_grace_period = 30
# network
vpc_zone_identifier = ["${aws_subnet.public.*.id}"]
# template
launch_configuration = "${aws_launch_configuration.worker.name}"
lifecycle {
# override the default destroy and replace update behavior
create_before_destroy = true
ignore_changes = ["image_id"]
}
tags = [{
key = "Name"
value = "${var.cluster_name}-worker"
propagate_at_launch = true
}]
}
# Worker template
resource "aws_launch_configuration" "worker" {
image_id = "${data.aws_ami.coreos.image_id}"
instance_type = "${var.worker_type}"
user_data = "${data.ct_config.worker_ign.rendered}"
# storage
root_block_device {
volume_type = "standard"
volume_size = "${var.disk_size}"
}
# network
security_groups = ["${aws_security_group.worker.id}"]
lifecycle {
// Override the default destroy and replace update behavior
create_before_destroy = true
}
}
# Worker Container Linux Config
data "template_file" "worker_config" {
template = "${file("${path.module}/cl/worker.yaml.tmpl")}"
vars = {
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
k8s_etcd_service_ip = "${cidrhost(var.service_cidr, 15)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
kubeconfig_ca_cert = "${module.bootkube.ca_cert}"
kubeconfig_kubelet_cert = "${module.bootkube.kubelet_cert}"
kubeconfig_kubelet_key = "${module.bootkube.kubelet_key}"
kubeconfig_server = "${module.bootkube.server}"
}
}
data "ct_config" "worker_ign" {
content = "${data.template_file.worker_config.rendered}"
pretty_print = false
}
# Security Group (instance firewall)
resource "aws_security_group" "worker" {
name = "${var.cluster_name}-worker"
description = "${var.cluster_name} worker security group"
vpc_id = "${aws_vpc.network.id}"
tags = "${map("Name", "${var.cluster_name}-worker")}"
}
resource "aws_security_group_rule" "worker-icmp" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "icmp"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "worker-ssh" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 22
to_port = 22
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "worker-http" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 80
to_port = 80
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "worker-https" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 443
to_port = 443
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "worker-flannel" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "udp"
from_port = 8472
to_port = 8472
source_security_group_id = "${aws_security_group.controller.id}"
}
resource "aws_security_group_rule" "worker-flannel-self" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "udp"
from_port = 8472
to_port = 8472
self = true
}
resource "aws_security_group_rule" "worker-node-exporter" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 9100
to_port = 9100
self = true
}
resource "aws_security_group_rule" "worker-kubelet" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 10250
to_port = 10250
source_security_group_id = "${aws_security_group.controller.id}"
}
resource "aws_security_group_rule" "worker-kubelet-self" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 10250
to_port = 10250
self = true
}
resource "aws_security_group_rule" "worker-kubelet-read" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 10255
to_port = 10255
source_security_group_id = "${aws_security_group.controller.id}"
}
resource "aws_security_group_rule" "worker-kubelet-read-self" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 10255
to_port = 10255
self = true
}
resource "aws_security_group_rule" "ingress-health-self" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 10254
to_port = 10254
self = true
}
resource "aws_security_group_rule" "worker-bgp" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 179
to_port = 179
source_security_group_id = "${aws_security_group.controller.id}"
}
resource "aws_security_group_rule" "worker-bgp-self" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 179
to_port = 179
self = true
}
resource "aws_security_group_rule" "worker-ipip" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = 4
from_port = 0
to_port = 0
source_security_group_id = "${aws_security_group.controller.id}"
}
resource "aws_security_group_rule" "worker-ipip-self" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = 4
from_port = 0
to_port = 0
self = true
}
resource "aws_security_group_rule" "worker-ipip-legacy" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = 94
from_port = 0
to_port = 0
source_security_group_id = "${aws_security_group.controller.id}"
}
resource "aws_security_group_rule" "worker-ipip-legacy-self" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = 94
from_port = 0
to_port = 0
self = true
}
resource "aws_security_group_rule" "worker-egress" {
security_group_id = "${aws_security_group.worker.id}"
type = "egress"
protocol = "-1"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
}

View File

@ -0,0 +1,19 @@
data "aws_ami" "coreos" {
most_recent = true
owners = ["595879546273"]
filter {
name = "architecture"
values = ["x86_64"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
filter {
name = "name"
values = ["CoreOS-${var.os_channel}-*"]
}
}
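This data source always resolves to the most recent Container Linux AMI for the configured channel. A hedged sketch of an equivalent standalone lookup pinned to the beta channel (resource and output names are illustrative):

data "aws_ami" "coreos_beta" {
  most_recent = true
  owners      = ["595879546273"]

  # same owner, channel pinned instead of interpolated
  filter {
    name   = "name"
    values = ["CoreOS-beta-*"]
  }
}

output "coreos_beta_ami" {
  value = "${data.aws_ami.coreos_beta.image_id}"
}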

View File

@ -42,6 +42,7 @@ systemd:
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
@ -56,7 +57,8 @@ systemd:
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/node \
--pod-manifest-path=/etc/kubernetes/manifests
--pod-manifest-path=/etc/kubernetes/manifests \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
RestartSec=5
@ -81,29 +83,14 @@ storage:
mode: 0644
contents:
inline: |
apiVersion: v1
kind: Config
clusters:
- name: local
cluster:
server: ${kubeconfig_server}
certificate-authority-data: ${kubeconfig_ca_cert}
users:
- name: kubelet
user:
client-certificate-data: ${kubeconfig_kubelet_cert}
client-key-data: ${kubeconfig_kubelet_key}
contexts:
- context:
cluster: local
user: kubelet
${kubeconfig}
- path: /etc/kubernetes/kubelet.env
filesystem: root
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.3
KUBELET_IMAGE_TAG=v1.10.0
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -121,7 +108,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://gcr.io/google_containers/hyperkube:v1.9.3 \
docker://gcr.io/google_containers/hyperkube:v1.10.0 \
--net=host \
--dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)

View File

@ -0,0 +1,82 @@
# Network Load Balancer for Ingress
resource "aws_lb" "ingress" {
name = "${var.name}-ingress"
load_balancer_type = "network"
internal = false
subnets = ["${var.subnet_ids}"]
enable_cross_zone_load_balancing = true
}
# Forward HTTP traffic to workers
resource "aws_lb_listener" "ingress-http" {
load_balancer_arn = "${aws_lb.ingress.arn}"
protocol = "TCP"
port = 80
default_action {
type = "forward"
target_group_arn = "${aws_lb_target_group.workers-http.arn}"
}
}
# Forward HTTPS traffic to workers
resource "aws_lb_listener" "ingress-https" {
load_balancer_arn = "${aws_lb.ingress.arn}"
protocol = "TCP"
port = 443
default_action {
type = "forward"
target_group_arn = "${aws_lb_target_group.workers-https.arn}"
}
}
# Network Load Balancer target groups of instances
resource "aws_lb_target_group" "workers-http" {
name = "${var.name}-workers-http"
vpc_id = "${var.vpc_id}"
target_type = "instance"
protocol = "TCP"
port = 80
# Ingress Controller HTTP health check
health_check {
protocol = "HTTP"
port = 10254
path = "/healthz"
# NLBs require the healthy and unhealthy thresholds to match
healthy_threshold = 3
unhealthy_threshold = 3
# Health check interval must be 10 or 30 seconds
interval = 10
}
}
resource "aws_lb_target_group" "workers-https" {
name = "${var.name}-workers-https"
vpc_id = "${var.vpc_id}"
target_type = "instance"
protocol = "TCP"
port = 443
# Ingress Controller HTTP health check
health_check {
protocol = "HTTP"
port = 10254
path = "/healthz"
# NLBs require the healthy and unhealthy thresholds to match
healthy_threshold = 3
unhealthy_threshold = 3
# Health check interval must be 10 or 30 seconds
interval = 10
}
}

View File

@ -0,0 +1,4 @@
output "ingress_dns_name" {
value = "${aws_lb.ingress.dns_name}"
description = "DNS name of the network load balancer for distributing traffic to Ingress controllers"
}
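Applications are typically pointed at the ingress NLB by a CNAME record. A hedged sketch, assuming a worker pool module named "workers" and a hypothetical zone and hostname:

resource "aws_route53_record" "app" {
  zone_id = "Z3PAABBCFAKEC0"  # hypothetical Route53 zone ID
  name    = "app.aws.example.com."
  type    = "CNAME"
  ttl     = 300
  records = ["${module.workers.ingress_dns_name}"]
}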

View File

@ -0,0 +1,87 @@
variable "name" {
type = "string"
description = "Unique name for the worker pool"
}
# AWS
variable "vpc_id" {
type = "string"
description = "Must be set to `vpc_id` output by cluster"
}
variable "subnet_ids" {
type = "list"
description = "Must be set to `subnet_ids` output by cluster"
}
variable "security_groups" {
type = "list"
description = "Must be set to `worker_security_groups` output by cluster"
}
# instances
variable "count" {
type = "string"
default = "1"
description = "Number of instances"
}
variable "instance_type" {
type = "string"
default = "t2.small"
description = "EC2 instance type"
}
variable "os_channel" {
type = "string"
default = "stable"
description = "Container Linux AMI channel (stable, beta, alpha)"
}
variable "disk_size" {
type = "string"
default = "40"
description = "Size of the EBS volume in GB"
}
variable "disk_type" {
type = "string"
default = "gp2"
description = "Type of the EBS volume (e.g. standard, gp2, io1)"
}
variable "clc_snippets" {
type = "list"
description = "Container Linux Config snippets"
default = []
}
# configuration
variable "kubeconfig" {
type = "string"
description = "Must be set to `kubeconfig` output by cluster"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for user 'core'"
}
variable "service_cidr" {
description = <<EOD
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube-apiserver, the 10th IP will be reserved for kube-dns.
EOD
type = "string"
default = "10.3.0.0/16"
}
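For reference, the reserved addresses are derived from this CIDR with Terraform's cidrhost(); a sketch using the default range (output name is illustrative):

# with the default service_cidr = "10.3.0.0/16":
#   cidrhost("10.3.0.0/16", 1)  -> 10.3.0.1  (kube-apiserver)
#   cidrhost("10.3.0.0/16", 10) -> 10.3.0.10 (kube-dns)
output "kube_dns_service_ip" {
  value = "${cidrhost("10.3.0.0/16", 10)}"
}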
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by kube-dns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = "string"
default = "cluster.local"
}

View File

@ -0,0 +1,75 @@
# Workers AutoScaling Group
resource "aws_autoscaling_group" "workers" {
name = "${var.name}-worker ${aws_launch_configuration.worker.name}"
# count
desired_capacity = "${var.count}"
min_size = "${var.count}"
max_size = "${var.count + 2}"
default_cooldown = 30
health_check_grace_period = 30
# network
vpc_zone_identifier = ["${var.subnet_ids}"]
# template
launch_configuration = "${aws_launch_configuration.worker.name}"
# target groups to which instances should be added
target_group_arns = [
"${aws_lb_target_group.workers-http.id}",
"${aws_lb_target_group.workers-https.id}",
]
lifecycle {
# override the default destroy and replace update behavior
create_before_destroy = true
}
tags = [{
key = "Name"
value = "${var.name}-worker"
propagate_at_launch = true
}]
}
# Worker template
resource "aws_launch_configuration" "worker" {
image_id = "${data.aws_ami.coreos.image_id}"
instance_type = "${var.instance_type}"
user_data = "${data.ct_config.worker_ign.rendered}"
# storage
root_block_device {
volume_type = "${var.disk_type}"
volume_size = "${var.disk_size}"
}
# network
security_groups = ["${var.security_groups}"]
lifecycle {
// Override the default destroy and replace update behavior
create_before_destroy = true
ignore_changes = ["image_id"]
}
}
# Worker Container Linux Config
data "template_file" "worker_config" {
template = "${file("${path.module}/cl/worker.yaml.tmpl")}"
vars = {
kubeconfig = "${indent(10, var.kubeconfig)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}
}
data "ct_config" "worker_ign" {
content = "${data.template_file.worker_config.rendered}"
pretty_print = false
snippets = ["${var.clc_snippets}"]
}

View File

@ -11,10 +11,10 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.9.3 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.0 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Dashboards, Metrics, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
## Docs

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=203b90169ead2380f74cc64ea1f02c109806c9bc"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=5f3546b66ffb9946b36e612537bb6a1830ae7746"
cluster_name = "${var.cluster_name}"
api_servers = ["${var.k8s_domain_name}"]

View File

@ -7,7 +7,7 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.2.15"
Environment="ETCD_IMAGE_TAG=v3.3.2"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${domain_name}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${domain_name}:2380"
@ -90,6 +90,7 @@ systemd:
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/master \
--node-labels=node-role.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
@ -117,7 +118,7 @@ storage:
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.3
KUBELET_IMAGE_TAG=v1.10.0
- path: /etc/hostname
filesystem: root
mode: 0644
@ -144,7 +145,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.10.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \

View File

@ -82,7 +82,7 @@ storage:
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.3
KUBELET_IMAGE_TAG=v1.10.0
- path: /etc/hostname
filesystem: root
mode: 0644

View File

@ -1,5 +1,5 @@
# Secure copy etcd TLS assets and kubeconfig to controllers. Activates kubelet.service
resource "null_resource" "copy-etcd-secrets" {
resource "null_resource" "copy-controller-secrets" {
count = "${length(var.controller_names)}"
connection {
@ -61,13 +61,13 @@ resource "null_resource" "copy-etcd-secrets" {
"sudo mv etcd-peer.key /etc/ssl/etcd/etcd/peer.key",
"sudo chown -R etcd:etcd /etc/ssl/etcd",
"sudo chmod -R 500 /etc/ssl/etcd",
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}
# Secure copy kubeconfig to all workers. Activates kubelet.service
resource "null_resource" "copy-kubeconfig" {
resource "null_resource" "copy-worker-secrets" {
count = "${length(var.worker_names)}"
connection {
@ -84,7 +84,7 @@ resource "null_resource" "copy-kubeconfig" {
provisioner "remote-exec" {
inline = [
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}
@ -95,13 +95,16 @@ resource "null_resource" "bootkube-start" {
# Without depends_on, this remote-exec may start before the kubeconfig copy.
# Terraform only does one task at a time, so it would try to bootstrap
# while no Kubelets are running.
depends_on = ["null_resource.copy-etcd-secrets", "null_resource.copy-kubeconfig"]
depends_on = [
"null_resource.copy-controller-secrets",
"null_resource.copy-worker-secrets",
]
connection {
type = "ssh"
host = "${element(var.controller_domains, 0)}"
user = "core"
timeout = "30m"
timeout = "15m"
}
provisioner "file" {
@ -111,7 +114,7 @@ resource "null_resource" "bootkube-start" {
provisioner "remote-exec" {
inline = [
"sudo mv /home/core/assets /opt/bootkube",
"sudo mv $HOME/assets /opt/bootkube",
"sudo systemctl start bootkube",
]
}

View File

@ -1,3 +1,10 @@
variable "cluster_name" {
type = "string"
description = "Unique cluster name"
}
# bare-metal
variable "matchbox_http_endpoint" {
type = "string"
description = "Matchbox HTTP read-only endpoint (e.g. http://matchbox.example.com:8080)"
@ -13,17 +20,7 @@ variable "container_linux_version" {
description = "Container Linux version of the kernel/initrd to PXE or the image to install"
}
variable "cluster_name" {
type = "string"
description = "Cluster name"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key to set as an authorized_key on machines"
}
# Machines
# machines
# Terraform's crude "type system" does not properly support lists of maps so we do this.
variable "controller_names" {
@ -50,13 +47,18 @@ variable "worker_domains" {
type = "list"
}
# bootkube assets
# configuration
variable "k8s_domain_name" {
description = "Controller DNS name which resolves to a controller instance. Workers and kubeconfig's will communicate with this endpoint (e.g. cluster.example.com)"
type = "string"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for user 'core'"
}
variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
type = "string"
@ -75,14 +77,14 @@ variable "network_mtu" {
}
variable "pod_cidr" {
description = "CIDR IP range to assign Kubernetes pods"
description = "CIDR IPv4 range to assign Kubernetes pods"
type = "string"
default = "10.2.0.0/16"
}
variable "service_cidr" {
description = <<EOD
CIDR IP range to assign Kubernetes services.
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube-apiserver, the 10th IP will be reserved for kube-dns.
EOD

View File

@ -1,117 +0,0 @@
---
systemd:
units:
- name: docker.service
enable: true
- name: locksmithd.service
mask: true
- name: kubelet.path
enable: true
contents: |
[Unit]
Description=Watch for kubeconfig
[Path]
PathExists=/etc/kubernetes/kubeconfig
[Install]
WantedBy=multi-user.target
- name: wait-for-dns.service
enable: true
contents: |
[Unit]
Description=Wait for DNS entries
Wants=systemd-resolved.service
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
- name: kubelet.service
contents: |
[Unit]
Description=Kubelet via Hyperkube
Wants=rpc-statd.service
[Service]
EnvironmentFile=/etc/kubernetes/kubelet.env
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/cache/kubelet-pod.uuid \
--volume=resolv,kind=host,source=/etc/resolv.conf \
--mount volume=resolv,target=/etc/resolv.conf \
--volume var-lib-cni,kind=host,source=/var/lib/cni \
--mount volume=var-lib-cni,target=/var/lib/cni \
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=opt-cni-bin,target=/opt/cni/bin \
--volume var-log,kind=host,source=/var/log \
--mount volume=var-log,target=/var/log \
--insecure-options=image"
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
--allow-privileged \
--anonymous-auth=false \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns={{.k8s_dns_service_ip}} \
--cluster_domain={{.cluster_domain_suffix}} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--hostname-override={{.domain_name}} \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/node \
--pod-manifest-path=/etc/kubernetes/manifests \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
storage:
{{ if index . "pxe" }}
disks:
- device: /dev/sda
wipe_table: true
partitions:
- label: ROOT
filesystems:
- name: root
mount:
device: "/dev/sda1"
format: "ext4"
create:
force: true
options:
- "-LROOT"
{{end}}
files:
- path: /etc/kubernetes/kubelet.env
filesystem: root
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.3
- path: /etc/hostname
filesystem: root
mode: 0644
contents:
inline:
{{.domain_name}}
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
inline: |
fs.inotify.max_user_watches=16184
passwd:
users:
- name: core
ssh_authorized_keys:
- {{.ssh_authorized_key}}

View File

@ -1,22 +0,0 @@
resource "matchbox_group" "workers" {
count = "${length(var.worker_names)}"
name = "${format("%s-%s", var.cluster_name, element(var.worker_names, count.index))}"
profile = "${matchbox_profile.bootkube-worker-pxe.name}"
selector {
mac = "${element(var.worker_macs, count.index)}"
}
metadata {
pxe = "true"
domain_name = "${element(var.worker_domains, count.index)}"
etcd_endpoints = "${join(",", formatlist("%s:2379", var.controller_domains))}"
# TODO
etcd_on_host = "true"
k8s_etcd_service_ip = "10.3.0.15"
k8s_dns_service_ip = "${var.kube_dns_service_ip}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
ssh_authorized_key = "${var.ssh_authorized_key}"
}
}

View File

@ -1,20 +0,0 @@
// Container Linux Install profile (from release.core-os.net)
resource "matchbox_profile" "bootkube-worker-pxe" {
name = "bootkube-worker-pxe"
kernel = "http://${var.container_linux_channel}.release.core-os.net/amd64-usr/${var.container_linux_version}/coreos_production_pxe.vmlinuz"
initrd = [
"http://${var.container_linux_channel}.release.core-os.net/amd64-usr/${var.container_linux_version}/coreos_production_pxe_image.cpio.gz",
]
args = [
"initrd=coreos_production_pxe_image.cpio.gz",
"coreos.config.url=${var.matchbox_http_endpoint}/ignition?uuid=$${uuid}&mac=$${mac:hexhyp}",
"coreos.first_boot=yes",
"console=tty0",
"console=ttyS0",
"${var.kernel_args}",
]
container_linux_config = "${file("${path.module}/cl/bootkube-worker.yaml.tmpl")}"
}

View File

@ -1,22 +0,0 @@
# Secure copy kubeconfig to all nodes to activate kubelet.service
resource "null_resource" "copy-kubeconfig" {
count = "${length(var.worker_names)}"
connection {
type = "ssh"
host = "${element(var.worker_domains, count.index)}"
user = "core"
timeout = "60m"
}
provisioner "file" {
content = "${var.kubeconfig}"
destination = "$HOME/kubeconfig"
}
provisioner "remote-exec" {
inline = [
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}

View File

@ -1,72 +0,0 @@
variable "cluster_name" {
description = "Cluster name"
type = "string"
}
variable "matchbox_http_endpoint" {
type = "string"
description = "Matchbox HTTP read-only endpoint (e.g. http://matchbox.example.com:8080)"
}
variable "container_linux_channel" {
type = "string"
description = "Container Linux channel corresponding to the container_linux_version"
}
variable "container_linux_version" {
type = "string"
description = "Container Linux version of the kernel/initrd to PXE or the image to install"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key to set as an authorized key"
}
# machines
# Terraform's crude "type system" does not properly support lists of maps so we do this.
variable "controller_domains" {
type = "list"
}
variable "worker_names" {
type = "list"
}
variable "worker_macs" {
type = "list"
}
variable "worker_domains" {
type = "list"
}
# bootkube
variable "kubeconfig" {
type = "string"
}
variable "kube_dns_service_ip" {
description = "Kubernetes service IP for kube-dns (must be within server_cidr)"
type = "string"
default = "10.3.0.10"
}
# optional
variable "kernel_args" {
description = "Additional kernel arguments to provide at PXE boot."
type = "list"
default = [
"root=/dev/sda1",
]
}
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by kube-dns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = "string"
default = "cluster.local"
}

View File

@ -11,10 +11,10 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.9.3 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* Kubernetes v1.10.0 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Dashboards, Metrics, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
## Docs

View File

@ -1,12 +1,12 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=203b90169ead2380f74cc64ea1f02c109806c9bc"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=5f3546b66ffb9946b36e612537bb6a1830ae7746"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
etcd_servers = "${digitalocean_record.etcds.*.fqdn}"
asset_dir = "${var.asset_dir}"
networking = "${var.networking}"
networking = "flannel"
network_mtu = 1440
pod_cidr = "${var.pod_cidr}"
service_cidr = "${var.service_cidr}"

View File

@ -7,7 +7,7 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.2.15"
Environment="ETCD_IMAGE_TAG=v3.3.2"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
@ -77,6 +77,7 @@ systemd:
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
@ -92,8 +93,10 @@ systemd:
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/master \
--node-labels=node-role.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
RestartSec=10
@ -120,7 +123,7 @@ storage:
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.3
KUBELET_IMAGE_TAG=v1.10.0
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -141,7 +144,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.10.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \

View File

@ -53,6 +53,7 @@ systemd:
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
@ -68,7 +69,8 @@ systemd:
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/node \
--pod-manifest-path=/etc/kubernetes/manifests
--pod-manifest-path=/etc/kubernetes/manifests \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
RestartSec=5
@ -94,7 +96,7 @@ storage:
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.3
KUBELET_IMAGE_TAG=v1.10.0
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -112,7 +114,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://gcr.io/google_containers/hyperkube:v1.9.3 \
docker://gcr.io/google_containers/hyperkube:v1.10.0 \
--net=host \
--dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)

View File

@ -45,7 +45,7 @@ resource "digitalocean_droplet" "controllers" {
private_networking = true
user_data = "${element(data.ct_config.controller_ign.*.rendered, count.index)}"
ssh_keys = "${var.ssh_fingerprints}"
ssh_keys = ["${var.ssh_fingerprints}"]
tags = [
"${digitalocean_tag.controllers.id}",
@ -90,4 +90,6 @@ data "ct_config" "controller_ign" {
count = "${var.controller_count}"
content = "${element(data.template_file.controller_config.*.rendered, count.index)}"
pretty_print = false
snippets = ["${var.controller_clc_snippets}"]
}

View File

@ -1,10 +1,10 @@
# Secure copy kubeconfig to all nodes. Activates kubelet.service
resource "null_resource" "copy-secrets" {
count = "${var.controller_count + var.worker_count}"
# Secure copy etcd TLS assets and kubeconfig to controllers. Activates kubelet.service
resource "null_resource" "copy-controller-secrets" {
count = "${var.controller_count}"
connection {
type = "ssh"
host = "${element(concat(digitalocean_droplet.controllers.*.ipv4_address, digitalocean_droplet.workers.*.ipv4_address), count.index)}"
host = "${element(concat(digitalocean_droplet.controllers.*.ipv4_address), count.index)}"
user = "core"
timeout = "15m"
}
@ -61,7 +61,30 @@ resource "null_resource" "copy-secrets" {
"sudo mv etcd-peer.key /etc/ssl/etcd/etcd/peer.key",
"sudo chown -R etcd:etcd /etc/ssl/etcd",
"sudo chmod -R 500 /etc/ssl/etcd",
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}
# Secure copy kubeconfig to all workers. Activates kubelet.service.
resource "null_resource" "copy-worker-secrets" {
count = "${var.worker_count}"
connection {
type = "ssh"
host = "${element(concat(digitalocean_droplet.workers.*.ipv4_address), count.index)}"
user = "core"
timeout = "15m"
}
provisioner "file" {
content = "${module.bootkube.kubeconfig}"
destination = "$HOME/kubeconfig"
}
provisioner "remote-exec" {
inline = [
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}
@ -69,7 +92,11 @@ resource "null_resource" "copy-secrets" {
# Secure copy bootkube assets to ONE controller and start bootkube to perform
# one-time self-hosted cluster bootstrapping.
resource "null_resource" "bootkube-start" {
depends_on = ["module.bootkube", "null_resource.copy-secrets"]
depends_on = [
"module.bootkube",
"null_resource.copy-controller-secrets",
"null_resource.copy-worker-secrets",
]
connection {
type = "ssh"
@ -85,7 +112,7 @@ resource "null_resource" "bootkube-start" {
provisioner "remote-exec" {
inline = [
"sudo mv /home/core/assets /opt/bootkube",
"sudo mv $HOME/assets /opt/bootkube",
"sudo systemctl start bootkube",
]
}

View File

@ -1,8 +1,10 @@
variable "cluster_name" {
type = "string"
description = "Unique cluster name"
description = "Unique cluster name (prepended to dns_zone)"
}
# Digital Ocean
variable "region" {
type = "string"
description = "Digital Ocean region (e.g. nyc1, sfo2, fra1, tor1)"
@ -13,22 +15,12 @@ variable "dns_zone" {
description = "Digital Ocean domain (i.e. DNS zone) (e.g. do.example.com)"
}
variable "image" {
type = "string"
default = "coreos-stable"
description = "OS image from which to initialize the disk (e.g. coreos-stable)"
}
# instances
variable "controller_count" {
type = "string"
default = "1"
description = "Number of controllers"
}
variable "controller_type" {
type = "string"
default = "s-2vcpu-2gb"
description = "Digital Ocean droplet size (e.g. s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb)."
description = "Number of controllers (i.e. masters)"
}
variable "worker_count" {
@ -37,39 +29,57 @@ variable "worker_count" {
description = "Number of workers"
}
variable "controller_type" {
type = "string"
default = "s-2vcpu-2gb"
description = "Droplet type for controllers (e.g. s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb)."
}
variable "worker_type" {
type = "string"
default = "s-1vcpu-1gb"
description = "Digital Ocean droplet size (e.g. s-1vcpu-1gb, s-1vcpu-2gb, s-2vcpu-2gb)"
description = "Droplet type for workers (e.g. s-1vcpu-1gb, s-1vcpu-2gb, s-2vcpu-2gb)"
}
variable "image" {
type = "string"
default = "coreos-stable"
description = "Container Linux image for instances (e.g. coreos-stable)"
}
variable "controller_clc_snippets" {
type = "list"
description = "Controller Container Linux Config snippets"
default = []
}
variable "worker_clc_snippets" {
type = "list"
description = "Worker Container Linux Config snippets"
default = []
}
# configuration
variable "ssh_fingerprints" {
type = "list"
description = "SSH public key fingerprints. (e.g. see `ssh-add -l -E md5`)"
}
# bootkube assets
variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
type = "string"
}
variable "networking" {
description = "Choice of networking provider (flannel or calico)"
type = "string"
default = "flannel"
}
variable "pod_cidr" {
description = "CIDR IP range to assign Kubernetes pods"
description = "CIDR IPv4 range to assign Kubernetes pods"
type = "string"
default = "10.2.0.0/16"
}
variable "service_cidr" {
description = <<EOD
CIDR IP range to assign Kubernetes services.
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
EOD
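# Illustrative aside (not part of this file): with the default service_cidr
# "10.3.0.0/16", Terraform's cidrhost(var.service_cidr, 1) evaluates to
# "10.3.0.1" (the kube-apiserver service IP) and cidrhost(var.service_cidr, 10)
# to "10.3.0.10" (kube-dns), matching the reservations described above.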

View File

@ -26,7 +26,7 @@ resource "digitalocean_droplet" "workers" {
private_networking = true
user_data = "${data.ct_config.worker_ign.rendered}"
ssh_keys = "${var.ssh_fingerprints}"
ssh_keys = ["${var.ssh_fingerprints}"]
tags = [
"${digitalocean_tag.workers.id}",
@ -44,7 +44,6 @@ data "template_file" "worker_config" {
vars = {
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
k8s_etcd_service_ip = "${cidrhost(var.service_cidr, 15)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}
}
@ -52,4 +51,5 @@ data "template_file" "worker_config" {
data "ct_config" "worker_ign" {
content = "${data.template_file.worker_config.rendered}"
pretty_print = false
snippets = ["${var.worker_clc_snippets}"]
}

View File

@ -18,7 +18,7 @@ kubectl apply -f addons/cluo -R
$ kubectl get nodes --show-labels
...
container-linux-update.v1.coreos.com/group=stable
container-linux-update.v1.coreos.com/version=1576.5.0
container-linux-update.v1.coreos.com/version=1632.3.0
```
`update-operator` ensures one node reboots at a time and that pods are drained prior to reboot.
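One way to observe this (illustrative; exact node statuses vary) is to watch nodes being cordoned, drained, and rebooted in turn:
```
kubectl get nodes --watch
```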

View File

@ -1,6 +1,130 @@
# Customization
To customize clusters in ways that aren't supported by input variables, fork the repo and make changes to the Terraform module. Stay tuned for improvements to this strategy since it is beneficial to stay close to this upstream.
Typhoon provides minimal Kubernetes clusters with defaults we recommend for production. Terraform variables provide easy-to-use, supported customizations for clusters. Advanced options are available for customizing the architecture or hosts.
## Variables
Typhoon modules accept Terraform input variables for customizing clusters in meritorious ways (e.g. `worker_count`). Variables are carefully considered to provide essentials, while limiting complexity and test matrix burden. See each platform's tutorial for options.
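For example, a sketch of tuning a supported variable in a cluster definition rather than forking the module (the cluster name is illustrative):
```
module "digital-ocean-nemo" {
  ...
  worker_count = 3
}
```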
## Addons
Clusters are kept to a minimal Kubernetes control plane by offering components like Nginx Ingress Controller, Prometheus, Grafana, and Heapster as optional post-install [addons](https://github.com/poseidon/typhoon/tree/master/addons). Customize addons by modifying a copy of our addon manifests.
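For instance, apply a locally modified copy of an addon's manifests recursively (the addon path is illustrative):
```
kubectl apply -f addons/nginx-ingress -R
```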
## Hosts
### Container Linux
!!! danger
Container Linux Configs provide powerful host customization abilities. You are responsible for the additional configs defined for hosts.
Container Linux Configs (CLCs) declare how a Container Linux instance's disk should be provisioned on first boot from disk. CLCs define disk partitions, filesystems, files, systemd units, dropins, networkd configs, mount units, raid arrays, and users. Typhoon creates controller and worker instances with base Container Linux Configs to create a minimal, secure Kubernetes cluster on each platform.
Typhoon AWS, Google Cloud, and Digital Ocean give users the ability to provide CLC *snippets* - valid Container Linux Configs that are validated and additively merged into the Typhoon base config during `terraform plan`. This allows advanced host customizations and experimentation.
#### Examples
Container Linux [docs](https://coreos.com/os/docs/latest/clc-examples.html) show many simple config examples. The snippet below ensures a file `/opt/hello` is created with permissions 0644.
```
# custom-files
storage:
files:
- path: /opt/hello
filesystem: root
contents:
inline: |
Hello World
mode: 0644
```
Ensure a systemd unit `hello.service` is created and a dropin `50-etcd-cluster.conf` is added for `etcd-member.service`.
```
# custom-units
systemd:
units:
- name: hello.service
enable: true
contents: |
[Unit]
Description=Hello World
[Service]
Type=oneshot
ExecStart=/usr/bin/echo Hello World!
[Install]
WantedBy=multi-user.target
- name: etcd-member.service
enable: true
dropins:
- name: 50-etcd-cluster.conf
contents: |
Environment="ETCD_LOG_PACKAGE_LEVELS=etcdserver=WARNING,security=DEBUG"
```
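Snippets may also carry networkd units, as a further sketch (the interface name and addresses are placeholders, not recommendations):
```
# custom-networkd
networkd:
  units:
    - name: 50-static.network
      contents: |
        [Match]
        Name=eth0

        [Network]
        Address=10.0.0.10/24
        Gateway=10.0.0.1
        DNS=8.8.8.8
```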
#### Specification
View the Container Linux Config [format](https://coreos.com/os/docs/1576.4.0/configuration.html) to read about each field.
#### Usage
Write Container Linux Config *snippets* as files in the repository where you keep Terraform configs for clusters (perhaps in a `clc` or `snippets` subdirectory). You may organize snippets into multiple files as desired, provided each is a valid config.
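An illustrative layout (file names are arbitrary, matching the example below):
```
clusters/
├── nemo.tf
├── custom-files
└── custom-units
```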
Define an [AWS](https://typhoon.psdn.io/aws/#cluster), [Google Cloud](https://typhoon.psdn.io/google-cloud/#cluster), or [Digital Ocean](https://typhoon.psdn.io/digital-ocean/#cluster) cluster and fill in the optional `controller_clc_snippets` or `worker_clc_snippets` fields.
```
module "digital-ocean-nemo" {
...
controller_count = 1
worker_count = 2
controller_clc_snippets = [
"${file("./custom-files")}",
"${file("./custom-units")}",
]
worker_clc_snippets = [
"${file("./custom-files")}",
"${file("./custom-units")}",
]
...
}
```
Plan the resources to be created.
```
$ terraform plan
Plan: 54 to add, 0 to change, 0 to destroy.
```
Most syntax errors in CLCs can be caught during planning. For example, mangle the indentation in one of the CLC files:
```
$ terraform plan
...
error parsing Container Linux Config: error: yaml: line 3: did not find expected '-' indicator
```
Undo the mangle. Apply the changes to create the cluster per the tutorial.
```
$ terraform apply
```
Container Linux Configs (and the CoreOS Ignition system) create immutable infrastructure. Disk provisioning is performed only on first boot from disk. That means if you change a snippet used by an instance, Terraform will (correctly) try to destroy and recreate that instance. Be careful!
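As a sketch of what such a plan looks like (the resource address and hashes are illustrative), Terraform marks the changed `user_data` as forcing replacement:
```
$ terraform plan
-/+ digitalocean_droplet.workers[0] (new resource required)
      user_data: "3f0b6..." => "9c2a1..." (forces new resource)
```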
!!! danger
Destroying and recreating controller instances is destructive! etcd runs on controller instances and stores data there. Do not modify controller snippets. See [blue/green](https://typhoon.psdn.io/topics/maintenance/#upgrades) clusters.
## Architecture
To customize clusters in ways that aren't supported by input variables, fork Typhoon and maintain a repository with customizations. Reference the repository by changing the username.
```
module "digital-ocean-nemo" {
source = "git::https://github.com/USERNAME/typhoon//digital-ocean/container-linux/kubernetes?ref=myspecialcase"
...
}
```
To customize lower-level Kubernetes control plane bootstrapping, see the [poseidon/bootkube-terraform](https://github.com/poseidon/bootkube-terraform) Terraform module.
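A sketch of that approach (the fork and ref are hypothetical): within your Typhoon fork, point the internal `bootkube` module at your own rendering module:
```
module "bootkube" {
  source = "git::https://github.com/USERNAME/terraform-render-bootkube.git?ref=myspecialcase"
  ...
}
```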

View File

@ -0,0 +1,6 @@
# Advanced
Typhoon clusters offer several advanced features for skilled users.
* [Customization](customization.md)
* [Worker Pools](worker-pools.md)

View File

@ -0,0 +1,149 @@
# Worker Pools
Typhoon AWS and Google Cloud allow additional groups of workers to be defined and joined to a cluster. For example, add worker pools of instances with different types, disk sizes, Container Linux channels, or preemptibility modes.
Internal Terraform Modules:
* `aws/container-linux/kubernetes/workers`
* `google-cloud/container-linux/kubernetes/workers`
## AWS
Create a cluster following the AWS [tutorial](../aws.md#cluster). Define a worker pool using the AWS internal `workers` module.
```tf
module "tempest-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes/workers?ref=v1.10.0"
providers = {
aws = "aws.default"
}
# AWS
vpc_id = "${module.aws-tempest.vpc_id}"
subnet_ids = "${module.aws-tempest.subnet_ids}"
security_groups = "${module.aws-tempest.worker_security_groups}"
# configuration
name = "tempest-worker-pool"
kubeconfig = "${module.aws-tempest.kubeconfig}"
ssh_authorized_key = "${var.ssh_authorized_key}"
count = 2
instance_type = "m5.large"
os_channel = "beta"
}
```
Apply the change.
```
terraform apply
```
Verify an auto-scaling group of workers joins the cluster within a few minutes.
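For example (node names and ages are illustrative):
```
$ kubectl get nodes
NAME             STATUS   AGE   VERSION
ip-10-0-4-22     Ready    34m   v1.10.0
ip-10-0-12-221   Ready    34m   v1.10.0
ip-10-0-92-114   Ready    3m    v1.10.0
```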
### Variables
The AWS internal `workers` module supports a number of [variables](https://github.com/poseidon/typhoon/blob/master/aws/container-linux/kubernetes/workers/variables.tf).
#### Required
| Name | Description | Example |
|:-----|:------------|:--------|
| vpc_id | Must be set to `vpc_id` output by cluster | "${module.cluster.vpc_id}" |
| subnet_ids | Must be set to `subnet_ids` output by cluster | "${module.cluster.subnet_ids}" |
| security_groups | Must be set to `worker_security_groups` output by cluster | "${module.cluster.worker_security_groups}" |
| name | Unique name (distinct from cluster name) | "tempest-m5s" |
| kubeconfig | Must be set to `kubeconfig` output by cluster | "${module.cluster.kubeconfig}" |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
#### Optional
| Name | Description | Default | Example |
|:-----|:------------|:--------|:--------|
| count | Number of instances | 1 | 3 |
| instance_type | EC2 instance type | "t2.small" | "t2.medium" |
| os_channel | Container Linux AMI channel | stable | "beta", "alpha" |
| disk_size | Size of the disk in GB | 40 | 100 |
| service_cidr | Must match `service_cidr` of cluster | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | Must match `cluster_domain_suffix` of cluster | "cluster.local" | "k8s.example.com" |
Check the list of valid [instance types](https://aws.amazon.com/ec2/instance-types/).
## Google Cloud
Create a cluster following the Google Cloud [tutorial](../google-cloud.md#cluster). Define a worker pool using the Google Cloud internal `workers` module.
```tf
module "yavin-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.10.0"
providers = {
google = "google.default"
}
# Google Cloud
region = "us-central1"
network = "${module.google-cloud-yavin.network_name}"
cluster_name = "yavin"
# configuration
name = "yavin-16x"
kubeconfig = "${module.google-cloud-yavin.kubeconfig}"
ssh_authorized_key = "${var.ssh_authorized_key}"
count = 2
machine_type = "n1-standard-16"
os_image = "coreos-beta"
preemptible = true
}
```
Apply the change.
```
terraform apply
```
Verify a managed instance group of workers joins the cluster within a few minutes.
```
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.10.0
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.0
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.0
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.10.0
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.10.0
```
### Variables
The Google Cloud internal `workers` module supports a number of [variables](https://github.com/poseidon/typhoon/blob/master/google-cloud/container-linux/kubernetes/workers/variables.tf).
#### Required
| Name | Description | Example |
|:-----|:------------|:--------|
| region | Must be set to `region` of cluster | "us-central1" |
| network | Must be set to `network_name` output by cluster | "${module.cluster.network_name}" |
| name | Unique name (distinct from cluster name) | "yavin-16x" |
| cluster_name | Must be set to `cluster_name` of cluster | "yavin" |
| kubeconfig | Must be set to `kubeconfig` output by cluster | "${module.cluster.kubeconfig}" |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
#### Optional
| Name | Description | Default | Example |
|:-----|:------------|:--------|:--------|
| count | Number of instances | 1 | 3 |
| machine_type | Compute instance machine type | "n1-standard-1" | See below |
| os_image | Container Linux image for compute instances | "coreos-stable" | "coreos-alpha", "coreos-beta" |
| disk_size | Size of the disk in GB | 40 | 100 |
| preemptible | If true, Compute Engine will terminate instances randomly within 24 hours | false | true |
| service_cidr | Must match `service_cidr` of cluster | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | Must match `cluster_domain_suffix` of cluster | "cluster.local" | "k8s.example.com" |
Check the list of valid [machine types](https://cloud.google.com/compute/docs/machine-types).

View File

@ -1,6 +1,6 @@
# AWS
In this tutorial, we'll create a Kubernetes v1.9.3 cluster on AWS.
In this tutorial, we'll create a Kubernetes v1.10.0 cluster on AWS.
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, a VPC, gateway, subnets, auto-scaling groups of controllers and workers, network load balancers for controllers and workers, and security groups will be created.
@ -24,9 +24,9 @@ Terraform v0.11.1
Add the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin binary for your system.
```sh
wget https://github.com/coreos/terraform-provider-ct/releases/download/v0.2.0/terraform-provider-ct-v0.2.0-linux-amd64.tar.gz
tar xzf terraform-provider-ct-v0.2.0-linux-amd64.tar.gz
sudo mv terraform-provider-ct-v0.2.0-linux-amd64/terraform-provider-ct /usr/local/bin/
wget https://github.com/coreos/terraform-provider-ct/releases/download/v0.2.1/terraform-provider-ct-v0.2.1-linux-amd64.tar.gz
tar xzf terraform-provider-ct-v0.2.1-linux-amd64.tar.gz
sudo mv terraform-provider-ct-v0.2.1-linux-amd64/terraform-provider-ct /usr/local/bin/
```
Add the plugin to your `~/.terraformrc`.
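The file should point the `ct` provider at the plugin binary, for example:
```
providers {
  ct = "/usr/local/bin/terraform-provider-ct"
}
```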
@ -57,7 +57,7 @@ Configure the AWS provider to use your access key credentials in a `providers.tf
```tf
provider "aws" {
version = "~> 1.5.0"
version = "~> 1.11.0"
alias = "default"
region = "eu-central-1"
@ -96,7 +96,7 @@ Define a Kubernetes cluster using the module `aws/container-linux/kubernetes`.
```tf
module "aws-tempest" {
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.9.3"
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.10.0"
providers = {
aws = "aws.default"
@ -105,20 +105,19 @@ module "aws-tempest" {
template = "template.default"
tls = "tls.default"
}
cluster_name = "tempest"
# AWS
dns_zone = "aws.example.com"
dns_zone_id = "Z3PAABBCFAKEC0"
controller_count = 1
controller_type = "t2.medium"
worker_count = 2
worker_type = "t2.small"
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
cluster_name = "tempest"
dns_zone = "aws.example.com"
dns_zone_id = "Z3PAABBCFAKEC0"
# bootkube
asset_dir = "/home/user/.secrets/clusters/tempest"
# configuration
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
asset_dir = "/home/user/.secrets/clusters/tempest"
# optional
worker_count = 2
worker_type = "t2.medium"
}
```
@ -150,7 +149,7 @@ Get or update Terraform modules.
$ terraform get # downloads missing modules
$ terraform get --update # updates all modules
Get: git::https://github.com/poseidon/typhoon (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.10.0 (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
```
Plan the resources to be created.
@ -182,9 +181,9 @@ In 4-8 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/tempest/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
ip-10-0-12-221 Ready 34m v1.9.3
ip-10-0-19-112 Ready 34m v1.9.3
ip-10-0-4-22 Ready 34m v1.9.3
ip-10-0-12-221 Ready 34m v1.10.0
ip-10-0-19-112 Ready 34m v1.10.0
ip-10-0-4-22 Ready 34m v1.10.0
```
List the pods.
@ -210,10 +209,10 @@ kube-system pod-checkpointer-4kxtl-ip-10-0-12-221 1/1 Running 0
## Going Further
Learn about [version pinning](concepts.md#versioning), [maintenance](topics/maintenance.md), and [addons](addons/overview.md).
Learn about [maintenance](topics/maintenance.md) and [addons](addons/overview.md).
!!! note
On Container Linux clusters, install the `container-linux-update-operator` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
On Container Linux clusters, install the `CLUO` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
## Variables
@ -225,7 +224,6 @@ Learn about [version pinning](concepts.md#versioning), [maintenance](topics/main
| dns_zone | AWS Route53 DNS zone | "aws.example.com" |
| dns_zone_id | AWS Route53 DNS zone id | "Z3PAABBCFAKEC0" |
| ssh_authorized_key | SSH public key for ~/.ssh/authorized_keys | "ssh-rsa AAAAB3NZ..." |
| os_channel | Container Linux AMI channel | stable, beta, alpha |
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/tempest" |
#### DNS Zone
@ -250,15 +248,19 @@ Reference the DNS zone id with `"${aws_route53_zone.zone-for-clusters.zone_id}"`
| Name | Description | Default | Example |
|:-----|:------------|:--------|:--------|
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
| controller_type | Controller EC2 instance type | "t2.small" | "t2.medium" |
| worker_count | Number of workers | 1 | 3 |
| worker_type | Worker EC2 instance type | "t2.small" | "t2.medium" |
| controller_type | EC2 instance type for controllers | "t2.small" | See below |
| worker_type | EC2 instance type for workers | "t2.small" | See below |
| os_channel | Container Linux AMI channel | stable | stable, beta, alpha |
| disk_size | Size of the EBS volume in GB | "40" | "100" |
| disk_type | Type of the EBS volume | "gp2" | standard, gp2, io1 |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | |
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
| network_mtu | CNI interface MTU (calico only) | 1480 | 8981 |
| host_cidr | CIDR range to assign to EC2 instances | "10.0.0.0/16" | "10.1.0.0/16" |
| pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| host_cidr | CIDR IPv4 range to assign to EC2 instances | "10.0.0.0/16" | "10.1.0.0/16" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
Check the list of valid [instance types](https://aws.amazon.com/ec2/instance-types/).

View File

@ -1,6 +1,6 @@
# Bare-Metal
In this tutorial, we'll network boot and provision a Kubernetes v1.9.3 cluster on bare-metal.
In this tutorial, we'll network boot and provision a Kubernetes v1.10.0 cluster on bare-metal.
First, we'll deploy a [Matchbox](https://github.com/coreos/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Container Linux to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers.
@ -177,7 +177,7 @@ Define a Kubernetes cluster using the module `bare-metal/container-linux/kuberne
```tf
module "bare-metal-mercury" {
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.9.3"
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.10.0"
providers = {
local = "local.default"
@ -186,15 +186,16 @@ module "bare-metal-mercury" {
tls = "tls.default"
}
# install
# bare-metal
cluster_name = "mercury"
matchbox_http_endpoint = "http://matchbox.example.com"
container_linux_channel = "stable"
container_linux_version = "1576.5.0"
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
container_linux_version = "1632.3.0"
# cluster
cluster_name = "mercury"
k8s_domain_name = "node1.example.com"
# configuration
k8s_domain_name = "node1.example.com"
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
asset_dir = "/home/user/.secrets/clusters/mercury"
# machines
controller_names = ["node1"]
@ -212,9 +213,6 @@ module "bare-metal-mercury" {
"node2.example.com",
"node3.example.com",
]
# output assets dir
asset_dir = "/home/user/.secrets/clusters/mercury"
}
```
@ -246,7 +244,7 @@ Get or update Terraform modules.
$ terraform get # downloads missing modules
$ terraform get --update # updates all modules
Get: git::https://github.com/poseidon/typhoon (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.10.0 (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
```
Plan the resources to be created.
@ -259,6 +257,7 @@ Plan: 55 to add, 0 to change, 0 to destroy.
Apply the changes. Terraform will generate bootkube assets to `asset_dir` and create Matchbox profiles (e.g. controller, worker) and matching rules via the Matchbox API.
```sh
$ terraform apply
module.bare-metal-mercury.null_resource.copy-kubeconfig.0: Provisioning with 'file'...
module.bare-metal-mercury.null_resource.copy-etcd-secrets.0: Provisioning with 'file'...
module.bare-metal-mercury.null_resource.copy-kubeconfig.0: Still creating... (10s elapsed)
@ -317,9 +316,9 @@ bootkube[5]: Tearing down temporary bootstrap control plane...
$ export KUBECONFIG=/home/user/.secrets/clusters/mercury/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
node1.example.com Ready 11m v1.9.3
node2.example.com Ready 11m v1.9.3
node3.example.com Ready 11m v1.9.3
node1.example.com Ready 11m v1.10.0
node2.example.com Ready 11m v1.10.0
node3.example.com Ready 11m v1.10.0
```
List the pods.
@ -334,7 +333,6 @@ kube-system kube-apiserver-7336w 1/1 Running 0
kube-system kube-controller-manager-3271970485-b9chx 1/1 Running 0 11m
kube-system kube-controller-manager-3271970485-v30js 1/1 Running 1 11m
kube-system kube-dns-1187388186-mx9rt 3/3 Running 0 11m
kube-system kube-etcd-network-checkpointer-q24f7 1/1 Running 0 11m
kube-system kube-proxy-50sd4 1/1 Running 0 11m
kube-system kube-proxy-bczhp 1/1 Running 0 11m
kube-system kube-proxy-mp2fw 1/1 Running 0 11m
@ -346,10 +344,10 @@ kube-system pod-checkpointer-wf65d-node1.example.com 1/1 Running 0
## Going Further
Learn about [version pinning](concepts.md#versioning), [maintenance](topics/maintenance.md), and [addons](addons/overview.md).
Learn about [maintenance](topics/maintenance.md) and [addons](addons/overview.md).
!!! note
On Container Linux clusters, install the `container-linux-update-operator` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
On Container Linux clusters, install the `CLUO` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
## Variables
@ -357,19 +355,19 @@ Learn about [version pinning](concepts.md#versioning), [maintenance](topics/main
| Name | Description | Example |
|:-----|:------------|:--------|
| cluster_name | Unique cluster name | mercury |
| matchbox_http_endpoint | Matchbox HTTP read-only endpoint | http://matchbox.example.com:8080 |
| container_linux_channel | Container Linux channel | stable, beta, alpha |
| container_linux_version | Container Linux version of the kernel/initrd to PXE and the image to install | 1576.5.0 |
| cluster_name | Cluster name | mercury |
| container_linux_version | Container Linux version of the kernel/initrd to PXE and the image to install | 1632.3.0 |
| k8s_domain_name | FQDN resolving to the controller(s) nodes. Workers and kubectl will communicate with this endpoint | "myk8s.example.com" |
| ssh_authorized_key | SSH public key for ~/.ssh/authorized_keys | "ssh-rsa AAAAB3Nz..." |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3Nz..." |
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/mercury" |
| controller_names | Ordered list of controller short names | ["node1"] |
| controller_macs | Ordered list of controller identifying MAC addresses | ["52:54:00:a1:9c:ae"] |
| controller_domains | Ordered list of controller FQDNs | ["node1.example.com"] |
| worker_names | Ordered list of worker short names | ["node2", "node3"] |
| worker_macs | Ordered list of worker identifying MAC addresses | ["52:54:00:b2:2f:86", "52:54:00:c3:61:77"] |
| worker_domains | Ordered list of worker FQDNs | ["node2.example.com", "node3.example.com"] |
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/mercury" |
### Optional
@ -380,8 +378,8 @@ Learn about [version pinning](concepts.md#versioning), [maintenance](topics/main
| container_linux_oem | Specify alternative OEM image ids for the disk install | "" | "vmware_raw", "xen" |
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
| network_mtu | CNI interface MTU (calico-only) | 1480 | - |
| pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
| kernel_args | Additional kernel args to provide at PXE boot | [] | "kvm-intel.nested=1" |

View File

@ -1,6 +1,6 @@
# Digital Ocean
In this tutorial, we'll create a Kubernetes v1.9.3 cluster on Digital Ocean.
In this tutorial, we'll create a Kubernetes v1.10.0 cluster on Digital Ocean.
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, firewall rules, DNS records, tags, and droplets for Kubernetes controllers and workers will be created.
@ -24,9 +24,9 @@ Terraform v0.11.1
Add the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin binary for your system.
```sh
wget https://github.com/coreos/terraform-provider-ct/releases/download/v0.2.0/terraform-provider-ct-v0.2.0-linux-amd64.tar.gz
tar xzf terraform-provider-ct-v0.2.0-linux-amd64.tar.gz
sudo mv terraform-provider-ct-v0.2.0-linux-amd64/terraform-provider-ct /usr/local/bin/
wget https://github.com/coreos/terraform-provider-ct/releases/download/v0.2.1/terraform-provider-ct-v0.2.1-linux-amd64.tar.gz
tar xzf terraform-provider-ct-v0.2.1-linux-amd64.tar.gz
sudo mv terraform-provider-ct-v0.2.1-linux-amd64/terraform-provider-ct /usr/local/bin/
```
Add the plugin to your `~/.terraformrc`.
@ -90,7 +90,7 @@ Define a Kubernetes cluster using the module `digital-ocean/container-linux/kube
```tf
module "digital-ocean-nemo" {
source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes?ref=v1.9.3"
source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes?ref=v1.10.0"
providers = {
digitalocean = "digitalocean.default"
@ -100,19 +100,18 @@ module "digital-ocean-nemo" {
tls = "tls.default"
}
region = "nyc3"
dns_zone = "digital-ocean.example.com"
# Digital Ocean
cluster_name = "nemo"
region = "nyc3"
dns_zone = "digital-ocean.example.com"
cluster_name = "nemo"
image = "coreos-stable"
controller_count = 1
controller_type = "s-2vcpu-2gb"
worker_count = 2
worker_type = "s-1vcpu-1gb"
# configuration
ssh_fingerprints = ["d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7"]
# output assets dir
asset_dir = "/home/user/.secrets/clusters/nemo"
asset_dir = "/home/user/.secrets/clusters/nemo"
# optional
worker_count = 2
worker_type = "s-1vcpu-1gb"
}
```
@ -144,7 +143,7 @@ Get or update Terraform modules.
$ terraform get # downloads missing modules
$ terraform get --update # updates all modules
Get: git::https://github.com/poseidon/typhoon (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.10.0 (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
```
Plan the resources to be created.
@ -177,9 +176,9 @@ In 3-6 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/nemo/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
10.132.110.130 Ready 10m v1.9.3
10.132.115.81 Ready 10m v1.9.3
10.132.124.107 Ready 10m v1.9.3
10.132.110.130 Ready 10m v1.10.0
10.132.115.81 Ready 10m v1.10.0
10.132.124.107 Ready 10m v1.10.0
```
List the pods.
@ -204,10 +203,10 @@ kube-system pod-checkpointer-pr1lq-10.132.115.81 1/1 Running 0
## Going Further
Learn about [version pinning](concepts.md#versioning), [maintenance](topics/maintenance.md), and [addons](addons/overview.md).
Learn about [maintenance](topics/maintenance.md) and [addons](addons/overview.md).
!!! note
On Container Linux clusters, install the `container-linux-update-operator` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
On Container Linux clusters, install the `CLUO` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
## Variables
@ -243,8 +242,8 @@ resource "digitalocean_domain" "zone-for-clusters" {
DigitalOcean droplets are created with your SSH public key "fingerprint" (i.e. MD5 hash) to allow access. If your SSH public key is at `~/.ssh/id_rsa.pub`, find the fingerprint with,
```bash
ssh-keygen -lf ~/.ssh/id_rsa.pub | awk '{print $2}'
d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7
ssh-keygen -E md5 -lf ~/.ssh/id_rsa.pub | awk '{print $2}'
MD5:d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7
```
If you use `ssh-agent` (e.g. Yubikey for SSH), find the fingerprint with,
@ -254,26 +253,24 @@ ssh-add -l -E md5
2048 MD5:d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7 cardno:000603633110 (RSA)
```
If you uploaded an SSH key to DigitalOcean (not required), find the fingerprint under Settings -> Security. Finally, if you don't have an SSH key, [create one now](https://help.github.com/articles/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent/).
Digital Ocean requires the SSH public key be uploaded to your account, so you may also find the fingerprint under Settings -> Security. Finally, if you don't have an SSH key, [create one now](https://help.github.com/articles/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent/).
### Optional
| Name | Description | Default | Example |
|:-----|:------------|:--------|:--------|
| image | OS image for droplets | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha |
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
| controller_type | Digital Ocean droplet size | s-2vcpu-2gb | s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb, ... |
| worker_count | Number of workers | 1 | 3 |
| worker_type | Digital Ocean droplet size | s-1vcpu-1gb | s-1vcpu-1gb, s-1vcpu-2gb, s-2vcpu-2gb, ... |
| networking | Choice of networking provider | "flannel" | "flannel" |
| pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| controller_type | Droplet type for controllers | s-2vcpu-2gb | s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb, ... |
| worker_type | Droplet type for workers | s-1vcpu-1gb | s-1vcpu-1gb, s-1vcpu-2gb, s-2vcpu-2gb, ... |
| image | Container Linux image for instances | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
You can see all valid droplet sizes [on DigitalOcean's website](https://developers.digitalocean.com/documentation/changelog/api-v2/new-size-slugs-for-droplet-plan-changes/) or by [using their `doctl` command-line tool](https://github.com/digitalocean/doctl) via `doctl compute size list`.
Check the list of valid [droplet types](https://developers.digitalocean.com/documentation/changelog/api-v2/new-size-slugs-for-droplet-plan-changes/) or use `doctl compute size list`.
!!! warning
Do not choose a `controller_type` with less than 2GB of memory. Smaller droplets are not sufficient for running a controller and bootstrapping will fail.
!!! bug
Digital Ocean firewalls do not yet support the IP tunneling (IP in IP) protocol used by Calico. You can try using "calico" for `networking`, but it will only work if the cloud firewall is removed (unsafe).

View File

@ -1,6 +1,6 @@
# Google Cloud
In this tutorial, we'll create a Kubernetes v1.9.3 cluster on Google Compute Engine (not GKE).
In this tutorial, we'll create a Kubernetes v1.10.0 cluster on Google Compute Engine (not GKE).
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, a network, firewall rules, managed instance groups of Kubernetes controllers and workers, network load balancers for controllers and workers, and health checks will be created.
@ -24,9 +24,9 @@ Terraform v0.11.1
Add the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin binary for your system.
```sh
wget https://github.com/coreos/terraform-provider-ct/releases/download/v0.2.0/terraform-provider-ct-v0.2.0-linux-amd64.tar.gz
tar xzf terraform-provider-ct-v0.2.0-linux-amd64.tar.gz
sudo mv terraform-provider-ct-v0.2.0-linux-amd64/terraform-provider-ct /usr/local/bin/
wget https://github.com/coreos/terraform-provider-ct/releases/download/v0.2.1/terraform-provider-ct-v0.2.1-linux-amd64.tar.gz
tar xzf terraform-provider-ct-v0.2.1-linux-amd64.tar.gz
sudo mv terraform-provider-ct-v0.2.1-linux-amd64/terraform-provider-ct /usr/local/bin/
```
Add the plugin to your `~/.terraformrc`.
@ -57,7 +57,7 @@ Configure the Google Cloud provider to use your service account key, project-id,
```tf
provider "google" {
version = "1.2"
version = "1.6"
alias = "default"
credentials = "${file("~/.config/google-cloud/terraform.json")}"
@ -97,29 +97,28 @@ Define a Kubernetes cluster using the module `google-cloud/container-linux/kuber
```tf
module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.9.3"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.10.0"
providers = {
google = "google.default"
local = "local.default"
null = "null.default"
google = "google.default"
local = "local.default"
null = "null.default"
template = "template.default"
tls = "tls.default"
tls = "tls.default"
}
# Google Cloud
cluster_name = "yavin"
region = "us-central1"
dns_zone = "example.com"
dns_zone_name = "example-zone"
os_image = "coreos-stable-1576-5-0-v20180105"
cluster_name = "yavin"
controller_count = 1
worker_count = 2
# configuration
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
# output assets dir
asset_dir = "/home/user/.secrets/clusters/yavin"
asset_dir = "/home/user/.secrets/clusters/yavin"
# optional
worker_count = 2
}
```
@ -151,7 +150,7 @@ Get or update Terraform modules.
$ terraform get # downloads missing modules
$ terraform get --update # updates all modules
Get: git::https://github.com/poseidon/typhoon (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.10.0 (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
```
Plan the resources to be created.
@ -185,9 +184,9 @@ In 4-8 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.9.3
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.3
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.3
yavin-controller-0.c.example-com.internal Ready 6m v1.10.0
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.0
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.0
```
List the pods.
@ -212,10 +211,10 @@ kube-system pod-checkpointer-l6lrt 1/1 Running 0
## Going Further
Learn about [version pinning](concepts.md#versioning), [maintenance](topics/maintenance.md), and [addons](addons/overview.md).
Learn about [maintenance](topics/maintenance.md) and [addons](addons/overview.md).
!!! note
On Container Linux clusters, install the `container-linux-update-operator` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
On Container Linux clusters, install the `CLUO` addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
## Variables
@ -227,8 +226,7 @@ Learn about [version pinning](concepts.md#versioning), [maintenance](topics/main
| region | Google Cloud region | "us-central1" |
| dns_zone | Google Cloud DNS zone | "google-cloud.example.com" |
| dns_zone_name | Google Cloud DNS zone name | "example-zone" |
| ssh_authorized_key | SSH public key for ~/.ssh_authorized_keys | "ssh-rsa AAAAB3NZ..." |
| os_image | OS image for compute instances | "coreos-stable-1576-5-0-v20180105" |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/yavin" |
Check the list of valid [regions](https://cloud.google.com/compute/docs/regions-zones/regions-zones) and list Container Linux [images](https://cloud.google.com/compute/docs/images) with `gcloud compute images list | grep coreos`.
@ -254,13 +252,18 @@ resource "google_dns_managed_zone" "zone-for-clusters" {
| Name | Description | Default | Example |
|:-----|:------------|:--------|:--------|
| machine_type | Machine type for compute instances | "n1-standard-1" | See below |
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
| worker_count | Number of workers | 1 | 3 |
| worker_preemptible | If enabled, Compute Engine will terminate controllers randomly within 24 hours | false | true |
| controller_type | Machine type for controllers | "n1-standard-1" | See below |
| worker_type | Machine type for workers | "n1-standard-1" | See below |
| os_image | Container Linux image for compute instances | "coreos-stable" | "coreos-stable-1632-3-0-v20180215" |
| disk_size | Size of the disk in GB | 40 | 100 |
| worker_preemptible | If enabled, Compute Engine will terminate workers randomly within 24 hours | false | true |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | |
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
| pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
Check the list of valid [machine types](https://cloud.google.com/compute/docs/machine-types).

View File

@ -11,10 +11,11 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.9.3 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.0 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Dashboards, Metrics and other optional [addons](addons/overview.md)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/) and [preemption](https://typhoon.psdn.io/google-cloud/#preemption) (varies by platform)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
* Provided via Terraform Modules
## Modules
@ -23,7 +24,7 @@ Typhoon provides a Terraform Module for each supported operating system and plat
| Platform | Operating System | Terraform Module | Status |
|---------------|------------------|------------------|--------|
| AWS | Container Linux | [aws/container-linux/kubernetes](aws.md) | beta |
| AWS | Container Linux | [aws/container-linux/kubernetes](aws.md) | stable |
| Bare-Metal | Container Linux | [bare-metal/container-linux/kubernetes](bare-metal.md) | stable |
| Digital Ocean | Container Linux | [digital-ocean/container-linux/kubernetes](digital-ocean.md) | beta |
| Google Cloud | Container Linux | [google-cloud/container-linux/kubernetes](google-cloud.md) | beta |
@ -43,29 +44,28 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
```tf
module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.10.0"
providers = {
google = "google.default"
local = "local.default"
null = "null.default"
google = "google.default"
local = "local.default"
null = "null.default"
template = "template.default"
tls = "tls.default"
tls = "tls.default"
}
# Google Cloud
cluster_name = "yavin"
region = "us-central1"
dns_zone = "example.com"
dns_zone_name = "example-zone"
os_image = "coreos-stable-1576-5-0-v20180105"
cluster_name = "yavin"
controller_count = 1
worker_count = 2
# configuration
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
# output assets dir
asset_dir = "/home/user/.secrets/clusters/yavin"
asset_dir = "/home/user/.secrets/clusters/yavin"
# optional
worker_count = 2
}
```
@ -85,9 +85,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.9.3
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.3
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.3
yavin-controller-0.c.example-com.internal Ready 6m v1.10.0
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.0
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.0
```
List the pods.
@ -114,11 +114,11 @@ kube-system pod-checkpointer-l6lrt 1/1 Running 0
Ask questions on the IRC #typhoon channel on [freenode.net](http://freenode.net/).
## Background
## Motivation
Typhoon powers the author's cloud and colocation clusters. The project has evolved through operational experience and Kubernetes changes. Typhoon is shared under a free license to allow others to use the work freely and contribute to its upkeep.
Typhoon addresses real world needs, which you may share. It is honest about limitations or areas that aren't mature yet. It avoids buzzword bingo and hype. It does not aim to be the one-solution-fits-all distro. An ecosystem of free (or enterprise) Kubernetes distros is healthy.
Typhoon addresses real world needs, which you may share. It is honest about limitations or areas that aren't mature yet. It avoids buzzword bingo and hype. It does not aim to be the one-solution-fits-all distro. An ecosystem of Kubernetes distributions is healthy.
## Social Contract
@ -126,4 +126,7 @@ Typhoon is not a product, trial, or free-tier. It is not run by a company, does
Typhoon clusters will contain only [free](https://www.debian.org/intro/free) components. Cluster components will not collect data on users without their permission.
*Disclosure: The author works for Red Hat (prev CoreOS), but Typhoon is unassociated and maintained independently.*
## Donations
Typhoon does not accept money donations. Instead, we encourage you to donate to one of [these organizations](https://github.com/poseidon/typhoon/wiki/Donations) to show your appreciation.

View File

@ -18,7 +18,7 @@ module "google-cloud-yavin" {
}
module "bare-metal-mercury" {
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.9.3"
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.10.0"
...
}
```
@ -202,3 +202,37 @@ Re-run `terraform plan`. Plan will claim there are no changes to apply. Run `ter
### Verify
You should now be able to run `terraform plan` without errors. When you choose, you may comment or delete a module from Terraform configs and `terraform apply` should destroy the cluster correctly.
## terraform-provider-ct v0.2.1
Typhoon requires updating the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin installed on your system from v0.2.0 to [v0.2.1](https://github.com/coreos/terraform-provider-ct/releases/tag/v0.2.1).
Check your `~/.terraformrc` to find your current `terraform-provider-ct` plugin.
```
providers {
ct = "/usr/local/bin/terraform-provider-ct"
}
```
Make a backup copy. Install `terraform-provider-ct` v0.2.1.
```sh
wget https://github.com/coreos/terraform-provider-ct/releases/download/v0.2.1/terraform-provider-ct-v0.2.1-linux-amd64.tar.gz
tar xzf terraform-provider-ct-v0.2.1-linux-amd64.tar.gz
sudo mv terraform-provider-ct-v0.2.1-linux-amd64/terraform-provider-ct /usr/local/bin/
```
Re-initialize Terraform configs which have Typhoon cluster resources.
```
cd clusters
terraform init
```
Verify Terraform does not produce a diff related to Container Linux provisioning.
```
terraform plan
```

View File

@ -11,10 +11,10 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.9.3 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.0 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Dashboards, Metrics, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
## Docs

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=203b90169ead2380f74cc64ea1f02c109806c9bc"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=5f3546b66ffb9946b36e612537bb6a1830ae7746"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]

View File

@ -1,45 +1,44 @@
module "controllers" {
source = "controllers"
cluster_name = "${var.cluster_name}"
ssh_authorized_key = "${var.ssh_authorized_key}"
source = "controllers"
cluster_name = "${var.cluster_name}"
# GCE
network = "${google_compute_network.network.name}"
count = "${var.controller_count}"
region = "${var.region}"
network = "${google_compute_network.network.name}"
dns_zone = "${var.dns_zone}"
dns_zone_name = "${var.dns_zone_name}"
machine_type = "${var.machine_type}"
count = "${var.controller_count}"
machine_type = "${var.controller_type}"
os_image = "${var.os_image}"
disk_size = "${var.disk_size}"
# configuration
networking = "${var.networking}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
kubeconfig_ca_cert = "${module.bootkube.ca_cert}"
kubeconfig_kubelet_cert = "${module.bootkube.kubelet_cert}"
kubeconfig_kubelet_key = "${module.bootkube.kubelet_key}"
kubeconfig_server = "${module.bootkube.server}"
networking = "${var.networking}"
kubeconfig = "${module.bootkube.kubeconfig}"
ssh_authorized_key = "${var.ssh_authorized_key}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
clc_snippets = "${var.controller_clc_snippets}"
}
module "workers" {
source = "workers"
cluster_name = "${var.cluster_name}"
ssh_authorized_key = "${var.ssh_authorized_key}"
source = "workers"
name = "${var.cluster_name}"
cluster_name = "${var.cluster_name}"
# GCE
network = "${google_compute_network.network.name}"
region = "${var.region}"
network = "${google_compute_network.network.name}"
count = "${var.worker_count}"
machine_type = "${var.machine_type}"
machine_type = "${var.worker_type}"
os_image = "${var.os_image}"
disk_size = "${var.disk_size}"
preemptible = "${var.worker_preemptible}"
# configuration
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
kubeconfig_ca_cert = "${module.bootkube.ca_cert}"
kubeconfig_kubelet_cert = "${module.bootkube.kubelet_cert}"
kubeconfig_kubelet_key = "${module.bootkube.kubelet_key}"
kubeconfig_server = "${module.bootkube.server}"
kubeconfig = "${module.bootkube.kubeconfig}"
ssh_authorized_key = "${var.ssh_authorized_key}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
clc_snippets = "${var.worker_clc_snippets}"
}

View File

@ -7,7 +7,7 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.2.15"
Environment="ETCD_IMAGE_TAG=v3.3.2"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
@ -67,6 +67,7 @@ systemd:
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
@ -81,8 +82,10 @@ systemd:
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/master \
--node-labels=node-role.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
RestartSec=10
@ -108,29 +111,14 @@ storage:
mode: 0644
contents:
inline: |
apiVersion: v1
kind: Config
clusters:
- name: local
cluster:
server: ${kubeconfig_server}
certificate-authority-data: ${kubeconfig_ca_cert}
users:
- name: kubelet
user:
client-certificate-data: ${kubeconfig_kubelet_cert}
client-key-data: ${kubeconfig_kubelet_key}
contexts:
- context:
cluster: local
user: kubelet
${kubeconfig}
- path: /etc/kubernetes/kubelet.env
filesystem: root
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.3
KUBELET_IMAGE_TAG=v1.10.0
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -151,7 +139,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.10.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \

View File

@ -65,13 +65,10 @@ data "template_file" "controller_config" {
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
etcd_initial_cluster = "${join(",", formatlist("%s=https://%s:2380", null_resource.repeat.*.triggers.name, null_resource.repeat.*.triggers.domain))}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
ssh_authorized_key = "${var.ssh_authorized_key}"
kubeconfig_ca_cert = "${var.kubeconfig_ca_cert}"
kubeconfig_kubelet_cert = "${var.kubeconfig_kubelet_cert}"
kubeconfig_kubelet_key = "${var.kubeconfig_kubelet_key}"
kubeconfig_server = "${var.kubeconfig_server}"
kubeconfig = "${indent(10, var.kubeconfig)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}
}
@ -90,4 +87,5 @@ data "ct_config" "controller_ign" {
count = "${var.count}"
content = "${element(data.template_file.controller_config.*.rendered, count.index)}"
pretty_print = false
snippets = ["${var.clc_snippets}"]
}
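
The single `kubeconfig` input can be spliced into the Container Linux Config template because YAML block scalars are indentation-sensitive: `indent(10, var.kubeconfig)` pads every line after the first with 10 spaces, so the multi-line kubeconfig aligns under the template's `inline: |` block. A minimal sketch of the pattern, with hypothetical template and resource names:

# Sketch only: re-indenting a multi-line value for a YAML block scalar
# in a Terraform 0.11 template. Names are illustrative.
variable "kubeconfig" {
  type = "string"
}

data "template_file" "node_config" {
  # $${...} escapes Terraform interpolation so template_file
  # resolves the placeholder from vars instead
  template = <<EOF
storage:
  files:
    - path: /etc/kubernetes/kubeconfig
      contents:
        inline: |
          $${kubeconfig}
EOF

  vars = {
    # indent(10, ...) prefixes every line except the first with 10 spaces,
    # matching the 10-space depth of the "inline: |" content above
    kubeconfig = "${indent(10, var.kubeconfig)}"
  }
}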

View File

@ -3,9 +3,9 @@ variable "cluster_name" {
description = "Unique cluster name"
}
variable "ssh_authorized_key" {
variable "region" {
type = "string"
description = "SSH public key for logging in as user 'core'"
description = "Google Cloud region (e.g. us-central1, see `gcloud compute regions list`)."
}
variable "network" {
@ -30,11 +30,6 @@ variable "count" {
description = "Number of controller compute instances the instance group should manage"
}
variable "region" {
type = "string"
description = "Google Cloud region (e.g. us-central1, see `gcloud compute regions list`)."
}
variable "machine_type" {
type = "string"
description = "Machine type for compute instances (e.g. gcloud compute machine-types list)"
@ -48,20 +43,30 @@ variable "os_image" {
variable "disk_size" {
type = "string"
default = "40"
description = "The size of the disk in gigabytes."
description = "Size of the disk in GB"
}
// configuration
# configuration
variable "networking" {
description = "Choice of networking provider (flannel or calico)"
type = "string"
default = "flannel"
default = "calico"
}
variable "kubeconfig" {
type = "string"
description = "Generated Kubelet kubeconfig"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for logging in as user 'core'"
}
variable "service_cidr" {
description = <<EOD
CIDR IP range to assign Kubernetes services.
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
EOD
@ -75,24 +80,8 @@ variable "cluster_domain_suffix" {
default = "cluster.local"
}
// kubeconfig
variable "kubeconfig_ca_cert" {
type = "string"
description = "Generated kubeconfig CA certificate"
}
variable "kubeconfig_kubelet_cert" {
type = "string"
description = "Generated kubeconfig kubelet certificate"
}
variable "kubeconfig_kubelet_key" {
type = "string"
description = "Generated kubeconfig kubelet private key"
}
variable "kubeconfig_server" {
type = "string"
description = "Generated kubeconfig server"
variable "clc_snippets" {
type = "list"
description = "Container Linux Config snippets"
default = []
}
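
The new `clc_snippets` list is what the cluster-level `controller_clc_snippets` and `worker_clc_snippets` variables feed into: each element is a Container Linux Config fragment that is validated and additively merged into the base config during `terraform plan`. A hedged sketch of passing a snippet at the cluster level (module name and unit are illustrative; other required cluster variables are elided):

module "example-cluster" {
  # ...required cluster variables elided...

  # each element is a Container Linux Config fragment
  controller_clc_snippets = [
    <<EOF
systemd:
  units:
    - name: hello.service
      enable: true
      contents: |
        [Unit]
        Description=Example oneshot unit
        [Service]
        Type=oneshot
        ExecStart=/usr/bin/echo hello
        [Install]
        WantedBy=multi-user.target
EOF
  ]
}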

View File

@ -13,3 +13,7 @@ output "network_name" {
output "network_self_link" {
value = "${google_compute_network.network.self_link}"
}
output "kubeconfig" {
value = "${module.bootkube.kubeconfig}"
}

View File

@ -5,7 +5,7 @@ terraform {
}
provider "google" {
version = "~> 1.2"
version = "~> 1.6"
}
provider "local" {

View File

@ -1,7 +1,6 @@
# Secure copy etcd TLS assets and kubeconfig to controllers. Activates kubelet.service
resource "null_resource" "copy-secrets" {
depends_on = ["module.bootkube"]
count = "${var.controller_count}"
# Secure copy etcd TLS assets to controllers.
resource "null_resource" "copy-controller-secrets" {
count = "${var.controller_count}"
connection {
type = "ssh"
@ -10,11 +9,6 @@ resource "null_resource" "copy-secrets" {
timeout = "15m"
}
provisioner "file" {
content = "${module.bootkube.kubeconfig}"
destination = "$HOME/kubeconfig"
}
provisioner "file" {
content = "${module.bootkube.etcd_ca_cert}"
destination = "$HOME/etcd-client-ca.crt"
@ -62,7 +56,6 @@ resource "null_resource" "copy-secrets" {
"sudo mv etcd-peer.key /etc/ssl/etcd/etcd/peer.key",
"sudo chown -R etcd:etcd /etc/ssl/etcd",
"sudo chmod -R 500 /etc/ssl/etcd",
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}
@ -70,7 +63,12 @@ resource "null_resource" "copy-secrets" {
# Secure copy bootkube assets to ONE controller and start bootkube to perform
# one-time self-hosted cluster bootstrapping.
resource "null_resource" "bootkube-start" {
depends_on = ["module.controllers", "module.bootkube", "module.workers", "null_resource.copy-secrets"]
depends_on = [
"module.bootkube",
"module.controllers",
"module.workers",
"null_resource.copy-controller-secrets",
]
connection {
type = "ssh"
@ -86,7 +84,7 @@ resource "null_resource" "bootkube-start" {
provisioner "remote-exec" {
inline = [
"sudo mv /home/core/assets /opt/bootkube",
"sudo mv $HOME/assets /opt/bootkube",
"sudo systemctl start bootkube",
]
}
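
The bootstrap gating above is a generic Terraform pattern: a `null_resource` with an explicit `depends_on` list holds back a one-time `remote-exec` step until every listed module and resource has converged. A stripped-down sketch of the pattern, with hypothetical names:

# Sketch only: serialize a one-time provisioning step after two modules.
resource "null_resource" "one-time-setup" {
  # runs only after both modules finish provisioning
  depends_on = [
    "module.first",
    "module.second",
  ]

  connection {
    type    = "ssh"
    host    = "host.example.com"
    user    = "core"
    timeout = "15m"
  }

  provisioner "remote-exec" {
    inline = ["echo provisioning complete"]
  }
}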

View File

@ -1,8 +1,10 @@
variable "cluster_name" {
type = "string"
description = "Cluster name"
description = "Unique cluster name (prepended to dns_zone)"
}
# Google Cloud
variable "region" {
type = "string"
description = "Google Cloud Region (e.g. us-central1, see `gcloud compute regions list`)"
@ -10,34 +12,20 @@ variable "region" {
variable "dns_zone" {
type = "string"
description = "Google Cloud DNS Zone (e.g. google-cloud.dghubble.io)"
description = "Google Cloud DNS Zone (e.g. google-cloud.example.com)"
}
variable "dns_zone_name" {
type = "string"
description = "Google Cloud DNS Zone name (e.g. google-cloud-prod-zone)"
description = "Google Cloud DNS Zone name (e.g. example-zone)"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for user 'core'"
}
variable "machine_type" {
type = "string"
default = "n1-standard-1"
description = "Machine type for compute instances (see `gcloud compute machine-types list`)"
}
variable "os_image" {
type = "string"
description = "OS image from which to initialize the disk (see `gcloud compute images list`)"
}
# instances
variable "controller_count" {
type = "string"
default = "1"
description = "Number of controllers"
description = "Number of controllers (i.e. masters)"
}
variable "worker_count" {
@ -46,13 +34,54 @@ variable "worker_count" {
description = "Number of workers"
}
variable "controller_type" {
type = "string"
default = "n1-standard-1"
description = "Machine type for controllers (see `gcloud compute machine-types list`)"
}
variable "worker_type" {
type = "string"
default = "n1-standard-1"
description = "Machine type for workers (see `gcloud compute machine-types list`)"
}
variable "os_image" {
type = "string"
default = "coreos-stable"
description = "Container Linux image for compute instances (e.g. coreos-stable)"
}
variable "disk_size" {
type = "string"
default = "40"
description = "Size of the disk in GB"
}
variable "worker_preemptible" {
type = "string"
default = "false"
description = "If enabled, Compute Engine will terminate workers randomly within 24 hours"
}
# bootkube assets
variable "controller_clc_snippets" {
type = "list"
description = "Controller Container Linux Config snippets"
default = []
}
variable "worker_clc_snippets" {
type = "list"
description = "Worker Container Linux Config snippets"
default = []
}
# configuration
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for user 'core'"
}
variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
@ -66,14 +95,14 @@ variable "networking" {
}
variable "pod_cidr" {
description = "CIDR IP range to assign Kubernetes pods"
description = "CIDR IPv4 range to assign Kubernetes pods"
type = "string"
default = "10.2.0.0/16"
}
variable "service_cidr" {
description = <<EOD
CIDR IP range to assign Kubernetes services.
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
EOD
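
Taken together, the reorganized inputs let a cluster definition size controllers and workers independently and opt into larger disks or preemptible workers. A sketch of a definition exercising the new variables (the module source and the values shown are illustrative):

module "google-cloud-example" {
  source = "..."  # google-cloud/container-linux/kubernetes module path

  cluster_name = "example"

  # Google Cloud
  region        = "us-central1"
  dns_zone      = "google-cloud.example.com"
  dns_zone_name = "example-zone"

  # instances: controllers and workers sized independently
  controller_count   = 1
  controller_type    = "n1-standard-2"
  worker_count       = 2
  worker_type        = "n1-standard-1"
  disk_size          = 100
  worker_preemptible = true

  # configuration
  ssh_authorized_key = "ssh-rsa AAAA..."
  asset_dir          = "/home/user/.secrets/clusters/example"
}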

View File

@ -43,6 +43,7 @@ systemd:
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
@ -57,7 +58,8 @@ systemd:
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/node \
--pod-manifest-path=/etc/kubernetes/manifests
--pod-manifest-path=/etc/kubernetes/manifests \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
RestartSec=5
@ -82,29 +84,14 @@ storage:
mode: 0644
contents:
inline: |
apiVersion: v1
kind: Config
clusters:
- name: local
cluster:
server: ${kubeconfig_server}
certificate-authority-data: ${kubeconfig_ca_cert}
users:
- name: kubelet
user:
client-certificate-data: ${kubeconfig_kubelet_cert}
client-key-data: ${kubeconfig_kubelet_key}
contexts:
- context:
cluster: local
user: kubelet
${kubeconfig}
- path: /etc/kubernetes/kubelet.env
filesystem: root
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.3
KUBELET_IMAGE_TAG=v1.10.0
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -122,7 +109,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://gcr.io/google_containers/hyperkube:v1.9.3 \
docker://gcr.io/google_containers/hyperkube:v1.10.0 \
--net=host \
--dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)

View File

@ -1,18 +1,18 @@
# Static IPv4 address for the Network Load Balancer
resource "google_compute_address" "ingress-ip" {
name = "${var.cluster_name}-ingress-ip"
name = "${var.name}-ingress-ip"
}
# Network Load Balancer (i.e. forwarding rules)
resource "google_compute_forwarding_rule" "worker-http-lb" {
name = "${var.cluster_name}-worker-http-rule"
name = "${var.name}-worker-http-rule"
ip_address = "${google_compute_address.ingress-ip.address}"
port_range = "80"
target = "${google_compute_target_pool.workers.self_link}"
}
resource "google_compute_forwarding_rule" "worker-https-lb" {
name = "${var.cluster_name}-worker-https-rule"
name = "${var.name}-worker-https-rule"
ip_address = "${google_compute_address.ingress-ip.address}"
port_range = "443"
target = "${google_compute_target_pool.workers.self_link}"
@ -20,7 +20,7 @@ resource "google_compute_forwarding_rule" "worker-https-lb" {
# Network Load Balancer target pool of instances.
resource "google_compute_target_pool" "workers" {
name = "${var.cluster_name}-worker-pool"
name = "${var.name}-worker-pool"
health_checks = [
"${google_compute_http_health_check.ingress.name}",
@ -31,7 +31,7 @@ resource "google_compute_target_pool" "workers" {
# Ingress HTTP Health Check
resource "google_compute_http_health_check" "ingress" {
name = "${var.cluster_name}-ingress-health"
name = "${var.name}-ingress-health"
description = "Health check Ingress controller health host port"
timeout_sec = 5

View File

@ -1,44 +1,49 @@
variable "cluster_name" {
variable "name" {
type = "string"
description = "Unique cluster name"
description = "Unique name for the worker pool"
}
variable "ssh_authorized_key" {
variable "cluster_name" {
type = "string"
description = "SSH public key for logging in as user 'core'"
description = "Must be set to `cluster_name of cluster`"
}
# Google Cloud
variable "region" {
type = "string"
description = "Must be set to `region` of cluster"
}
variable "network" {
type = "string"
description = "Name of the network to attach to the compute instance interfaces"
description = "Must be set to `network_name` output by cluster"
}
# instances
variable "count" {
type = "string"
default = "1"
description = "Number of worker compute instances the instance group should manage"
}
variable "region" {
type = "string"
description = "Google Cloud region (e.g. us-central1, see `gcloud compute regions list`)."
}
variable "machine_type" {
type = "string"
default = "n1-standard-1"
description = "Machine type for compute instances (e.g. gcloud compute machine-types list)"
}
variable "os_image" {
type = "string"
description = "OS image from which to initialize the disk (e.g. gcloud compute images list)"
default = "coreos-stable"
description = "Container Linux image for compute instanges (e.g. gcloud compute images list)"
}
variable "disk_size" {
type = "string"
default = "40"
description = "The size of the disk in gigabytes."
description = "Size of the disk in GB"
}
variable "preemptible" {
@ -49,9 +54,19 @@ variable "preemptible" {
# configuration
variable "kubeconfig" {
type = "string"
description = "Must be set to `kubeconfig` output by cluster"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for user 'core'"
}
variable "service_cidr" {
description = <<EOD
CIDR IP range to assign Kubernetes services.
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
EOD
@ -65,24 +80,22 @@ variable "cluster_domain_suffix" {
default = "cluster.local"
}
# kubeconfig
variable "kubeconfig_ca_cert" {
type = "string"
description = "Generated kubeconfig CA certificate"
variable "clc_snippets" {
type = "list"
description = "Container Linux Config snippets"
default = []
}
variable "kubeconfig_kubelet_cert" {
# unofficial, undocumented, unsupported, temporary
variable "accelerator_type" {
type = "string"
description = "Generated kubeconfig kubelet certificate"
default = ""
description = "Google Compute Engine accelerator type (e.g. nvidia-tesla-k80, see gcloud compute accelerator-types list)"
}
variable "kubeconfig_kubelet_key" {
variable "accelerator_count" {
type = "string"
description = "Generated kubeconfig kubelet private key"
}
variable "kubeconfig_server" {
type = "string"
description = "Generated kubeconfig server"
default = "0"
description = "Number of compute engine accelerators"
}
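
With `cluster_name` generalized to a pool-level `name` and the kubeconfig, region, and network accepted as inputs, the workers module can now be instantiated on its own to attach extra worker pools to an existing cluster. A hedged sketch (module path and values illustrative, reusing the hypothetical cluster module from earlier), including the unofficial accelerator variables:

module "example-worker-pool" {
  source = "..."  # google-cloud/container-linux/kubernetes/workers module path

  # pool
  name         = "gpu-pool"
  cluster_name = "example"

  # Google Cloud (must match the cluster)
  region  = "us-central1"
  network = "${module.google-cloud-example.network_name}"

  # instances
  count        = 2
  machine_type = "n1-standard-4"

  # unofficial, undocumented, unsupported, temporary
  accelerator_type  = "nvidia-tesla-k80"
  accelerator_count = "1"

  # configuration
  kubeconfig         = "${module.google-cloud-example.kubeconfig}"
  ssh_authorized_key = "ssh-rsa AAAA..."
}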

View File

@ -1,11 +1,11 @@
# Regional managed instance group maintains a homogeneous set of workers that
# span the zones in the region.
resource "google_compute_region_instance_group_manager" "workers" {
name = "${var.cluster_name}-worker-group"
description = "Compute instance group of ${var.cluster_name} workers"
name = "${var.name}-worker-group"
description = "Compute instance group of ${var.name} workers"
# instance name prefix for instances in the group
base_instance_name = "${var.cluster_name}-worker"
base_instance_name = "${var.name}-worker"
instance_template = "${google_compute_instance_template.worker.self_link}"
region = "${var.region}"
@ -22,24 +22,21 @@ data "template_file" "worker_config" {
template = "${file("${path.module}/cl/worker.yaml.tmpl")}"
vars = {
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
k8s_etcd_service_ip = "${cidrhost(var.service_cidr, 15)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
ssh_authorized_key = "${var.ssh_authorized_key}"
kubeconfig_ca_cert = "${var.kubeconfig_ca_cert}"
kubeconfig_kubelet_cert = "${var.kubeconfig_kubelet_cert}"
kubeconfig_kubelet_key = "${var.kubeconfig_kubelet_key}"
kubeconfig_server = "${var.kubeconfig_server}"
kubeconfig = "${indent(10, var.kubeconfig)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}
}
data "ct_config" "worker_ign" {
content = "${data.template_file.worker_config.rendered}"
pretty_print = false
snippets = ["${var.clc_snippets}"]
}
resource "google_compute_instance_template" "worker" {
name_prefix = "${var.cluster_name}-worker-"
name_prefix = "${var.name}-worker-"
description = "Worker Instance template"
machine_type = "${var.machine_type}"
@ -67,8 +64,12 @@ resource "google_compute_instance_template" "worker" {
}
can_ip_forward = true
tags = ["worker", "${var.cluster_name}-worker", "${var.name}-worker"]
tags = ["worker", "${var.cluster_name}-worker"]
guest_accelerator {
count = "${var.accelerator_count}"
type = "${var.accelerator_type}"
}
lifecycle {
# To update an Instance Template, Terraform should replace the existing resource

View File

@ -58,4 +58,6 @@ pages:
- 'Performance': 'topics/performance.md'
- 'FAQ': 'faq.md'
- 'Advanced':
- 'Overview': 'advanced/overview.md'
- 'Customization': 'advanced/customization.md'
- 'Worker Pools': 'advanced/worker-pools.md'