Compare commits


31 Commits

Author SHA1 Message Date
77c0a4cf2e Update Kubernetes from v1.10.0 to v1.10.1
* Use kubernetes-incubator/bootkube v0.12.0
2018-04-12 20:57:31 -07:00
5035d56db2 Refactor GCP to remove controller internal module
* Remove the controller internal module to align with
other platforms, and because it's not a supported use case
2018-04-12 19:41:51 -07:00
9bb3de5327 Skip creating unused dirs on worker nodes 2018-04-11 22:23:51 -07:00
c8eabc2af4 Fix GCP controller_type and worker_type vars 2018-04-11 22:19:58 -07:00
2eaf858c5c Update example BGPPeer manifest
The previous example was outdated for Calico v3.0. It resulted in `error: unable to recognize "example.yaml": no matches for /, Kind=bgpPeer`.

See https://docs.projectcalico.org/v3.0/reference/calicoctl/resources/bgppeer.
2018-04-09 23:23:18 -05:00
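For reference, a BGPPeer in the Calico v3.0 resource format looks roughly like this (a minimal sketch based on the linked reference; the name, peer IP, and AS number are placeholders):

```yaml
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: example-peer
spec:
  # Peer with an upstream router (values are illustrative)
  peerIP: 192.168.1.1
  asNumber: 64512
```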
b8656fd74b Clarify bare-metal SSH instructions 2018-04-08 14:11:05 -07:00
d276fffcda Fix bare-metal multiple apply/ssh on Terraform v0.11.4+
* Terraform v0.11.4 introduced changes to remote-exec
that mean Typhoon bare-metal clusters require multiple
runs of terraform apply to ssh and bootstrap.
* Bare-metal installs PXE boot a live instance to install
to disk and then reboot from disk as controllers/workers.
Terraform remote-exec has no way to "know" to wait until
the reboot has occurred to kick off Kubernetes bootstrap.
Previously Typhoon created a "debug" user during this
install phase to allow an admin to SSH, but remote-exec
would hang, trying to connect as user "core". Terraform
v0.11.4 changes this behavior so remote-exec fails and
a user must re-run terraform apply until it succeeds.
* A new way to "trick" remote-exec into waiting for the
reboot into the installed system is to run SSH on a
non-standard port during the disk install. This retains
the ability for an admin to SSH during install (an ability
most distro installers lack) and fixes the issue so only a
single run of terraform apply is needed.
* https://github.com/hashicorp/terraform/pull/17359#issuecomment-376415464
2018-04-08 13:32:31 -07:00
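Concretely, the trick is a systemd socket override in the install-phase Container Linux Config, sketched here (the same change appears in the bare-metal profile diff further down):

```yaml
systemd:
  units:
    # During disk install, sshd listens on 2222 instead of 22, so
    # Terraform remote-exec (which connects on port 22) blocks until
    # the machine reboots into the installed system.
    - name: sshd.socket
      dropins:
        - name: 10-sshd-port.conf
          contents: |
            [Socket]
            ListenStream=
            ListenStream=2222
```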
6b08bde479 Use k8s.gcr.io instead of gcr.io/google_containers
* Kubernetes recommends using the alias to fetch images
from the nearest GCR regional mirror, to abstract the use
of GCR, and to drop names containing 'google'
* https://groups.google.com/forum/#!msg/kubernetes-dev/ytjk_rNrTa0/3EFUHvovCAAJ
2018-04-08 12:57:52 -07:00
f4b2396718 Return Prometheus deployment to be a worker workload
* Expose etcd metrics to workers so Prometheus can
run on a worker, rather than a controller
* Drop temporary firewall rules allowing Prometheus
to run on a controller and scrape targets
* Related to https://github.com/poseidon/typhoon/pull/175
2018-04-08 12:20:00 -07:00
b76126db93 Update docs builder and material theme 2018-04-08 00:00:03 -07:00
7186aa46da Update kube-state-metrics from v1.2.0 to v1.3.0
* https://github.com/kubernetes/kube-state-metrics/pull/412
* https://github.com/kubernetes/kube-state-metrics/pull/413
2018-04-04 21:04:13 -07:00
18dbaf74ce Update kube-dns from v1.14.8 to v1.14.9
* https://github.com/kubernetes/kubernetes/pull/61908
2018-04-04 21:00:23 -07:00
ce001e9d56 Update etcd from v3.3.2 to v3.3.3
* https://github.com/coreos/etcd/releases/tag/v3.3.3
2018-04-04 20:32:24 -07:00
d770393dbc Add etcd metrics, Prometheus scrapes, and Grafana dash
* Use etcd v3.3 --listen-metrics-urls to expose only metrics
data via http://0.0.0.0:2381 on controllers
* Add Prometheus discovery for etcd peers on controller nodes
* Temporarily drop two noisy Prometheus alerts
2018-04-03 20:31:00 -07:00
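The controller side of this change is a one-line etcd-member dropin, sketched below (the full context appears in the controller Container Linux Config diffs further down); Prometheus then discovers controller nodes and scrapes port 2381:

```yaml
systemd:
  units:
    - name: etcd-member.service
      dropins:
        - name: 40-etcd-cluster.conf
          contents: |
            [Service]
            # Serve metrics (and only metrics) over plain HTTP on 2381
            Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
```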
642f7ec22f Update CHANGES.md with Kubernetes link 2018-03-30 23:12:38 -07:00
1cc043d1eb Update Kubernetes from v1.9.6 to v1.10.0 2018-03-30 22:14:07 -07:00
f8e9bfb1c0 Add disk_type variable for EBS volume type on AWS
* Change EBS volume type from `standard` ("prior generation")
to `gp2`. Prometheus alerts are tuned for SSDs.
* Other platforms have fast enough disks by default
2018-03-29 22:51:54 -07:00
b1e41dcb99 addons: Update from Grafana v4.6.3 to v5.0.4
This reverts commit c59a9c66b1.
2018-03-28 19:45:19 -07:00
de4d90750e Use consistent naming of remote provision steps 2018-03-26 00:29:57 -07:00
7acd4931f6 Remove redundant kubeconfig copy on AWS and GCP
* AWS and Google Cloud make use of auto-scaling groups
and managed instance groups, respectively. As such, the
kubeconfig is already held in cloud user-data
* Controller instances are provisioned with a kubeconfig
from user-data. It's redundant to use a Terraform remote
file copy step for the kubeconfig.
2018-03-26 00:01:47 -07:00
cfd603bea2 Ensure etcd secrets are only distributed to controller hosts
* Previously, etcd secrets were erroneously distributed to worker
nodes (permissions 500, ownership etcd:etcd).
2018-03-25 23:46:44 -07:00
fdb543e834 Add optional controller_type and worker_type vars on GCP
* Remove optional machine_type variable on Google Cloud
* Use controller_type and worker_type instead
2018-03-25 22:11:18 -07:00
8d3d4220fd Add disk_size variable on Google Cloud 2018-03-25 22:04:14 -07:00
ba9daf439e Remove unmaintained pxe-worker internal module 2018-03-25 21:57:52 -07:00
38adb14bd2 Remove optional variable networking on Digital Ocean
* Calico isn't viable on Digital Ocean because their firewalls
do not support the IP-IP protocol. It's not viable to run a cluster
without firewalls just to use Calico.
* Remove the caveat note. Don't allow users to shoot themselves
in the foot
2018-03-25 21:48:51 -07:00
e43cf9f608 Organize and cleanup variable descriptions 2018-03-25 21:44:43 -07:00
455a4af27e Improve cluster definition examples in docs 2018-03-25 20:41:52 -07:00
39876e455f Fix docs to reflect enforced provider versions 2018-03-25 11:34:39 -07:00
da2be86e8c Add v1.9.6 heading to CHANGES.md 2018-03-22 22:01:29 -07:00
65a2751f77 addons: Update heapster from v1.5.1 to v1.5.2
* https://github.com/kubernetes/heapster/releases/tag/v1.5.2
2018-03-21 20:32:01 -07:00
a04ef3919a Update Kubernetes from v1.9.5 to v1.9.6 2018-03-21 20:29:52 -07:00
68 changed files with 721 additions and 1045 deletions

View File

@ -4,6 +4,67 @@ Notable changes between versions.
## Latest
* Kubernetes [v1.10.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1101)
* Enable etcd v3.3 metrics endpoint ([#175](https://github.com/poseidon/typhoon/pull/175))
* Use `k8s.gcr.io` instead of `gcr.io/google_containers` ([#180](https://github.com/poseidon/typhoon/pull/180))
* Kubernetes [recommends](https://groups.google.com/forum/#!msg/kubernetes-dev/ytjk_rNrTa0/3EFUHvovCAAJ) using the alias to pull from the nearest regional mirror and to abstract the backing container registry
* Update kube-dns from v1.14.8 to v1.14.9
* Update etcd from v3.3.2 to v3.3.3
* Use kubernetes-incubator/bootkube v0.12.0
#### Bare-Metal
* Fix need for multiple `terraform apply` runs to create a cluster with Terraform v0.11.4 ([#181](https://github.com/poseidon/typhoon/pull/181))
* To SSH during a disk install for debugging, SSH as user "core" on port 2222
* Remove the old trick of using a user "debug" during disk install
#### Google Cloud
* Refactor out the `controller` internal module
#### Addons
* Add Prometheus discovery for etcd peers on controller nodes ([#175](https://github.com/poseidon/typhoon/pull/175))
* Scrape etcd v3.3 `--listen-metrics-urls` for metrics
* Enable etcd alerts and populate the etcd Grafana dashboard
* Update kube-state-metrics from v1.2.0 to v1.3.0
## v1.10.0
* Kubernetes [v1.10.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1100)
* Remove unused, unmaintained `pxe-worker` internal module
#### AWS
* Add `disk_type` optional variable for setting the EBS volume type ([#176](https://github.com/poseidon/typhoon/pull/176))
* Change default type from `standard` to `gp2`. Prometheus etcd alerts are tuned for fast disks.
#### Digital Ocean
* Ensure etcd secrets are only distributed to controller hosts, not workers.
* Remove `networking` optional variable. Only flannel works on Digital Ocean.
#### Google Cloud
* Add `disk_size` optional variable for setting instance disk size in GB
* Add `controller_type` optional variable for setting machine type for controllers
* Add `worker_type` optional variable for setting machine type for workers
* Remove `machine_type` optional variable. Use `controller_type` and `worker_type`.
#### Addons
* Update Grafana from v4.6.3 to v5.0.4 ([#153](https://github.com/poseidon/typhoon/pull/153), [#174](https://github.com/poseidon/typhoon/pull/174))
* Restrict dashboard organization role to Viewer
## v1.9.6
* Kubernetes [v1.9.6](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v196)
* Update Calico from v3.0.3 to v3.0.4
#### Addons
* Update heapster from v1.5.1 to v1.5.2
## v1.9.5
* Kubernetes [v1.9.5](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v195)
@ -43,7 +104,7 @@ Notable changes between versions.
* Allow flexvolume plugins to be used on any Typhoon cluster (not just bare-metal)
* Upgrade etcd from v3.2.15 to v3.3.2
* Update Calico from v3.0.2 to v3.0.3
* Use kubernetes-incubator/bootkube v0.10.0
* Use kubernetes-incubator/bootkube v0.11.0
* [Recommend](https://typhoon.psdn.io/topics/maintenance/#terraform-provider-ct-v021) updating `terraform-provider-ct` plugin from v0.2.0 to [v0.2.1](https://github.com/coreos/terraform-provider-ct/releases/tag/v0.2.1) (action recommended)
#### AWS

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.9.5 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/) and [preemption](https://typhoon.psdn.io/google-cloud/#preemption) (varies by platform)
@ -44,29 +44,28 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
```tf
module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.9.5"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.10.1"
providers = {
google = "google.default"
local = "local.default"
null = "null.default"
google = "google.default"
local = "local.default"
null = "null.default"
template = "template.default"
tls = "tls.default"
tls = "tls.default"
}
# Google Cloud
cluster_name = "yavin"
region = "us-central1"
dns_zone = "example.com"
dns_zone_name = "example-zone"
os_image = "coreos-stable"
cluster_name = "yavin"
controller_count = 1
worker_count = 2
# configuration
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
# output assets dir
asset_dir = "/home/user/.secrets/clusters/yavin"
asset_dir = "/home/user/.secrets/clusters/yavin"
# optional
worker_count = 2
}
```
@ -87,9 +86,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.9.5
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.5
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.5
yavin-controller-0.c.example-com.internal Ready 6m v1.10.1
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.1
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.1
```
List the pods.

View File

@ -0,0 +1,15 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: grafana-dashboard-providers
namespace: monitoring
data:
dashboard-providers.yaml: |+
apiVersion: 1
providers:
- name: 'default'
orgId: 1
folder: ''
type: file
options:
path: /var/lib/grafana/dashboards

View File

@ -5,14 +5,12 @@ metadata:
namespace: monitoring
data:
deployment-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -39,7 +37,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -110,7 +108,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -181,7 +179,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "Bps",
"gauge": {
@ -262,7 +260,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -333,7 +331,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -403,7 +401,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -473,7 +471,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -550,7 +548,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -665,7 +663,7 @@ data:
{
"allValue": ".*",
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": false,
"label": "Namespace",
@ -685,7 +683,7 @@ data:
{
"allValue": null,
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": false,
"label": "Deployment",
@ -737,24 +735,11 @@ data:
"title": "Deployment",
"version": 1
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
etcd-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"name": "DS_PROMETHEUS",
"name": "prometheus",
"label": "prometheus",
"description": "",
"type": "datasource",
@ -813,7 +798,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"format": "none",
@ -889,7 +874,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 0,
@ -978,7 +963,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 0,
@ -1079,7 +1064,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"decimals": null,
"editable": false,
"error": false,
@ -1161,7 +1146,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 0,
@ -1250,7 +1235,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 0,
@ -1342,7 +1327,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 5,
@ -1422,7 +1407,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 5,
@ -1502,7 +1487,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 0,
@ -1582,7 +1567,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"decimals": null,
"editable": false,
"error": false,
@ -1676,7 +1661,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 0,
@ -1782,7 +1767,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"decimals": 0,
"editable": false,
"error": false,
@ -1909,26 +1894,13 @@ data:
"title": "etcd",
"version": 4
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
kubernetes-capacity-planning-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -1954,7 +1926,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2032,7 +2004,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2134,7 +2106,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2250,7 +2222,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -2333,7 +2305,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2440,7 +2412,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percentunit",
"gauge": {
@ -2522,7 +2494,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2604,7 +2576,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2695,7 +2667,7 @@ data:
"aliasColors": {},
"bars": false,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2782,7 +2754,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -2897,26 +2869,13 @@ data:
"title": "Kubernetes Capacity Planning",
"version": 4
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
kubernetes-cluster-health-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -2944,7 +2903,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3025,7 +2984,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3101,7 +3060,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3177,7 +3136,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3263,7 +3222,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3339,7 +3298,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3415,7 +3374,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3491,7 +3450,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3605,26 +3564,13 @@ data:
"title": "Kubernetes Cluster Health",
"version": 9
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
kubernetes-cluster-status-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -3651,7 +3597,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3723,7 +3669,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3805,7 +3751,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -3877,7 +3823,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -3949,7 +3895,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4021,7 +3967,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -4103,7 +4049,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4175,7 +4121,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4247,7 +4193,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4319,7 +4265,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4429,26 +4375,13 @@ data:
"title": "Kubernetes Cluster Status",
"version": 3
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
kubernetes-control-plane-status-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -4475,7 +4408,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4550,7 +4483,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4625,7 +4558,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4700,7 +4633,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4783,7 +4716,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -4869,7 +4802,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -4944,7 +4877,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -5069,26 +5002,13 @@ data:
"title": "Kubernetes Control Plane Status",
"version": 3
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
kubernetes-resource-requests-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -5113,7 +5033,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"description": "This represents the total [CPU resource requests](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-cpu) in the cluster.\nFor comparison the total [allocatable CPU cores](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node-allocatable.md) is also shown.",
"editable": false,
"error": false,
@ -5202,7 +5122,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -5284,7 +5204,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"description": "This represents the total [memory resource requests](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-memory) in the cluster.\nFor comparison the total [allocatable memory](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node-allocatable.md) is also shown.",
"editable": false,
"error": false,
@ -5373,7 +5293,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -5486,26 +5406,13 @@ data:
"title": "Kubernetes Resource Requests",
"version": 2
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
nodes-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -5532,7 +5439,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -5611,7 +5518,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -5713,7 +5620,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -5825,7 +5732,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -5907,7 +5814,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -6014,7 +5921,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percentunit",
"gauge": {
@ -6096,7 +6003,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -6178,7 +6085,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -6270,7 +6177,7 @@ data:
{
"allValue": null,
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": false,
"label": null,
@ -6322,26 +6229,13 @@ data:
"title": "Nodes",
"version": 2
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
pods-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -6366,7 +6260,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -6472,7 +6366,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -6576,7 +6470,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -6662,7 +6556,7 @@ data:
{
"allValue": ".*",
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": true,
"label": "Namespace",
@ -6682,7 +6576,7 @@ data:
{
"allValue": null,
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": false,
"label": "Pod",
@ -6702,7 +6596,7 @@ data:
{
"allValue": ".*",
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": true,
"label": "Container",
@ -6754,26 +6648,13 @@ data:
"title": "Pods",
"version": 1
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
statefulset-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -6800,7 +6681,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -6871,7 +6752,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -6942,7 +6823,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "Bps",
"gauge": {
@ -7023,7 +6904,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -7094,7 +6975,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -7164,7 +7045,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -7234,7 +7115,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -7311,7 +7192,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -7405,7 +7286,7 @@ data:
{
"allValue": ".*",
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": false,
"label": "Namespace",
@ -7425,7 +7306,7 @@ data:
{
"allValue": null,
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": false,
"label": "StatefulSet",
@ -7477,23 +7358,4 @@ data:
"title": "StatefulSet",
"version": 1
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
prometheus-datasource.json: |+
{
"access": "proxy",
"basicAuth": false,
"name": "prometheus",
"type": "prometheus",
"url": "http://prometheus.monitoring.svc"
}
---

View File

@ -0,0 +1,16 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: grafana-datasources
namespace: monitoring
data:
prometheus.yaml: |+
apiVersion: 1
datasources:
- name: prometheus
type: prometheus
access: proxy
orgId: 1
url: http://prometheus.monitoring.svc.cluster.local
version: 1
editable: false

View File

@ -21,7 +21,7 @@ spec:
spec:
containers:
- name: grafana
image: grafana/grafana:4.6.3
image: grafana/grafana:5.0.4
env:
- name: GF_SERVER_HTTP_PORT
value: "8080"
@ -30,7 +30,7 @@ spec:
- name: GF_AUTH_ANONYMOUS_ENABLED
value: "true"
- name: GF_AUTH_ANONYMOUS_ORG_ROLE
value: Admin
value: Viewer
ports:
- name: http
containerPort: 8080
@ -41,22 +41,20 @@ spec:
limits:
memory: 200Mi
cpu: 200m
- name: grafana-watcher
image: quay.io/coreos/grafana-watcher:v0.0.8
args:
- '--watch-dir=/etc/grafana/dashboards'
- '--grafana-url=http://localhost:8080'
resources:
requests:
memory: "16Mi"
cpu: "50m"
limits:
memory: "32Mi"
cpu: "100m"
volumeMounts:
- name: dashboards
mountPath: /etc/grafana/dashboards
- name: datasources
mountPath: /etc/grafana/provisioning/datasources
- name: dashboard-providers
mountPath: /etc/grafana/provisioning/dashboards
- name: dashboards
mountPath: /var/lib/grafana/dashboards
volumes:
- name: datasources
configMap:
name: grafana-datasources
- name: dashboard-providers
configMap:
name: grafana-dashboard-providers
- name: dashboards
configMap:
name: grafana-dashboards

View File

@ -18,7 +18,7 @@ spec:
serviceAccountName: heapster
containers:
- name: heapster
image: k8s.gcr.io/heapster-amd64:v1.5.1
image: k8s.gcr.io/heapster-amd64:v1.5.2
command:
- /heapster
- --source=kubernetes.summary_api:''

View File

@ -20,7 +20,7 @@ spec:
# Any image is permissible as long as:
# 1. It serves a 404 page at /
# 2. It serves 200 on a /healthz endpoint
image: gcr.io/google_containers/defaultbackend:1.4
image: k8s.gcr.io/defaultbackend:1.4
ports:
- containerPort: 8080
resources:

View File

@ -20,7 +20,7 @@ spec:
# Any image is permissible as long as:
# 1. It serves a 404 page at /
# 2. It serves 200 on a /healthz endpoint
image: gcr.io/google_containers/defaultbackend:1.4
image: k8s.gcr.io/defaultbackend:1.4
ports:
- containerPort: 8080
resources:

View File

@ -20,7 +20,7 @@ spec:
# Any image is permissible as long as:
# 1. It serves a 404 page at /
# 2. It serves 200 on a /healthz endpoint
image: gcr.io/google_containers/defaultbackend:1.4
image: k8s.gcr.io/defaultbackend:1.4
ports:
- containerPort: 8080
resources:

View File

@ -112,6 +112,22 @@ data:
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
# Scrape etcd metrics from controllers
- job_name: 'etcd'
kubernetes_sd_configs:
- role: node
scheme: http
relabel_configs:
- source_labels: [__meta_kubernetes_node_label_node_role_kubernetes_io_controller]
action: keep
regex: 'true'
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- source_labels: [__meta_kubernetes_node_name]
action: replace
target_label: __address__
replacement: '${1}:2381'
# Scrape config for service endpoints.
#
# The relabeling allows the actual service scrape endpoint to be configured

View File

@ -5,6 +5,8 @@ metadata:
rules:
- apiGroups: [""]
resources:
- configmaps
- secrets
- nodes
- pods
- services

View File

@ -22,7 +22,7 @@ spec:
serviceAccountName: kube-state-metrics
containers:
- name: kube-state-metrics
image: quay.io/coreos/kube-state-metrics:v1.2.0
image: quay.io/coreos/kube-state-metrics:v1.3.0
ports:
- name: metrics
containerPort: 8080
@ -33,7 +33,7 @@ spec:
initialDelaySeconds: 5
timeoutSeconds: 5
- name: addon-resizer
image: gcr.io/google_containers/addon-resizer:1.7
image: k8s.gcr.io/addon-resizer:1.7
resources:
limits:
cpu: 100m

View File

@ -63,26 +63,6 @@ data:
description: etcd instance {{ $labels.instance }} has seen {{ $value }} leader
changes within the last hour
summary: a high number of leader changes within the etcd cluster are happening
- alert: HighNumberOfFailedGRPCRequests
expr: sum(rate(grpc_server_handled_total{grpc_code!="OK",job="etcd"}[5m])) BY (grpc_service, grpc_method)
/ sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method) > 0.01
for: 10m
labels:
severity: warning
annotations:
description: '{{ $value }}% of requests for {{ $labels.grpc_method }} failed
on etcd instance {{ $labels.instance }}'
summary: a high number of gRPC requests are failing
- alert: HighNumberOfFailedGRPCRequests
expr: sum(rate(grpc_server_handled_total{grpc_code!="OK",job="etcd"}[5m])) BY (grpc_service, grpc_method)
/ sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method) > 0.05
for: 5m
labels:
severity: critical
annotations:
description: '{{ $value }}% of requests for {{ $labels.grpc_method }} failed
on etcd instance {{ $labels.instance }}'
summary: a high number of gRPC requests are failing
- alert: GRPCRequestsSlow
expr: histogram_quantile(0.99, sum(rate(grpc_server_handling_seconds_bucket{job="etcd",grpc_type="unary"}[5m])) by (grpc_service, grpc_method, le))
> 0.15

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.9.5 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/)

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=457b596fa06b6752f25ed320337dcbedcce7f0fb"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=db36b92abced3c4b0af279adfd5ed4bf0cf8c39f"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]

View File

@ -7,12 +7,13 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.2"
Environment="ETCD_IMAGE_TAG=v3.3.3"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
Environment="ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379"
Environment="ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380"
Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
Environment="ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}"
Environment="ETCD_STRICT_RECONFIG_CHECK=true"
Environment="ETCD_SSL_DIR=/etc/ssl/etcd"
@ -116,8 +117,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.5
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -138,7 +139,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.12.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \

View File

@ -28,7 +28,7 @@ resource "aws_instance" "controllers" {
# storage
root_block_device {
volume_type = "standard"
volume_type = "${var.disk_type}"
volume_size = "${var.disk_size}"
}
@ -56,10 +56,10 @@ data "template_file" "controller_config" {
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
etcd_initial_cluster = "${join(",", formatlist("%s=https://%s:2380", null_resource.repeat.*.triggers.name, null_resource.repeat.*.triggers.domain))}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
kubeconfig = "${indent(10, module.bootkube.kubeconfig)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}
}

View File

@ -51,6 +51,16 @@ resource "aws_security_group_rule" "controller-etcd" {
self = true
}
resource "aws_security_group_rule" "controller-etcd-metrics" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 2381
to_port = 2381
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-flannel" {
security_group_id = "${aws_security_group.controller.id}"

View File

@ -1,5 +1,5 @@
# Secure copy etcd TLS assets and kubeconfig to controllers. Activates kubelet.service
resource "null_resource" "copy-secrets" {
# Secure copy etcd TLS assets to controllers.
resource "null_resource" "copy-controller-secrets" {
count = "${var.controller_count}"
connection {
@ -9,11 +9,6 @@ resource "null_resource" "copy-secrets" {
timeout = "15m"
}
provisioner "file" {
content = "${module.bootkube.kubeconfig}"
destination = "$HOME/kubeconfig"
}
provisioner "file" {
content = "${module.bootkube.etcd_ca_cert}"
destination = "$HOME/etcd-client-ca.crt"
@ -61,7 +56,6 @@ resource "null_resource" "copy-secrets" {
"sudo mv etcd-peer.key /etc/ssl/etcd/etcd/peer.key",
"sudo chown -R etcd:etcd /etc/ssl/etcd",
"sudo chmod -R 500 /etc/ssl/etcd",
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}
@ -69,7 +63,12 @@ resource "null_resource" "copy-secrets" {
# Secure copy bootkube assets to ONE controller and start bootkube to perform
# one-time self-hosted cluster bootstrapping.
resource "null_resource" "bootkube-start" {
depends_on = ["module.bootkube", "null_resource.copy-secrets", "aws_route53_record.apiserver"]
depends_on = [
"module.bootkube",
"module.workers",
"aws_route53_record.apiserver",
"null_resource.copy-controller-secrets",
]
connection {
type = "ssh"
@ -85,7 +84,7 @@ resource "null_resource" "bootkube-start" {
provisioner "remote-exec" {
inline = [
"sudo mv /home/core/assets /opt/bootkube",
"sudo mv $HOME/assets /opt/bootkube",
"sudo systemctl start bootkube",
]
}

View File

@ -1,21 +1,44 @@
variable "cluster_name" {
type = "string"
description = "Cluster name"
description = "Unique cluster name (prepended to dns_zone)"
}
# AWS
variable "dns_zone" {
type = "string"
description = "AWS DNS Zone (e.g. aws.dghubble.io)"
description = "AWS Route53 DNS Zone (e.g. aws.example.com)"
}
variable "dns_zone_id" {
type = "string"
description = "AWS DNS Zone ID (e.g. Z3PAABBCFAKEC0)"
description = "AWS Route53 DNS Zone ID (e.g. Z3PAABBCFAKEC0)"
}
variable "ssh_authorized_key" {
# instances
variable "controller_count" {
type = "string"
description = "SSH public key for user 'core'"
default = "1"
description = "Number of controllers (i.e. masters)"
}
variable "worker_count" {
type = "string"
default = "1"
description = "Number of workers"
}
variable "controller_type" {
type = "string"
default = "t2.small"
description = "EC2 instance type for controllers"
}
variable "worker_type" {
type = "string"
default = "t2.small"
description = "EC2 instance type for workers"
}
variable "os_channel" {
@ -27,37 +50,13 @@ variable "os_channel" {
variable "disk_size" {
type = "string"
default = "40"
description = "The size of the disk in Gigabytes"
description = "Size of the EBS volume in GB"
}
variable "host_cidr" {
description = "CIDR IPv4 range to assign to EC2 nodes"
variable "disk_type" {
type = "string"
default = "10.0.0.0/16"
}
variable "controller_count" {
type = "string"
default = "1"
description = "Number of controllers"
}
variable "controller_type" {
type = "string"
default = "t2.small"
description = "Controller EC2 instance type"
}
variable "worker_count" {
type = "string"
default = "1"
description = "Number of workers"
}
variable "worker_type" {
type = "string"
default = "t2.small"
description = "Worker EC2 instance type"
default = "gp2"
description = "Type of the EBS volume (e.g. standard, gp2, io1)"
}
variable "controller_clc_snippets" {
@ -72,7 +71,12 @@ variable "worker_clc_snippets" {
default = []
}
# bootkube assets
# configuration
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for user 'core'"
}
variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
@ -91,6 +95,12 @@ variable "network_mtu" {
default = "1480"
}
variable "host_cidr" {
description = "CIDR IPv4 range to assign to EC2 nodes"
type = "string"
default = "10.0.0.0/16"
}
variable "pod_cidr" {
description = "CIDR IPv4 range to assign Kubernetes pods"
type = "string"

View File

@ -39,8 +39,6 @@ systemd:
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
@ -89,8 +87,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.5
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -108,7 +106,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://gcr.io/google_containers/hyperkube:v1.9.5 \
docker://k8s.gcr.io/hyperkube:v1.10.1 \
--net=host \
--dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)

View File

@ -1,21 +1,23 @@
variable "name" {
type = "string"
description = "Unique name instance group"
description = "Unique name for the worker pool"
}
# AWS
variable "vpc_id" {
type = "string"
description = "ID of the VPC for creating instances"
description = "Must be set to `vpc_id` output by cluster"
}
variable "subnet_ids" {
type = "list"
description = "List of subnet IDs for creating instances"
description = "Must be set to `subnet_ids` output by cluster"
}
variable "security_groups" {
type = "list"
description = "List of security group IDs"
description = "Must be set to `worker_security_groups` output by cluster"
}
# instances
@ -41,14 +43,26 @@ variable "os_channel" {
variable "disk_size" {
type = "string"
default = "40"
description = "Size of the disk in GB"
description = "Size of the EBS volume in GB"
}
variable "disk_type" {
type = "string"
default = "gp2"
description = "Type of the EBS volume (e.g. standard, gp2, io1)"
}
variable "clc_snippets" {
type = "list"
description = "Container Linux Config snippets"
default = []
}
# configuration
variable "kubeconfig" {
type = "string"
description = "Generated Kubelet kubeconfig"
description = "Must be set to `kubeconfig` output by cluster"
}
variable "ssh_authorized_key" {
@ -71,9 +85,3 @@ variable "cluster_domain_suffix" {
type = "string"
default = "cluster.local"
}
variable "clc_snippets" {
type = "list"
description = "Container Linux Config snippets"
default = []
}

View File

@ -42,7 +42,7 @@ resource "aws_launch_configuration" "worker" {
# storage
root_block_device {
volume_type = "standard"
volume_type = "${var.disk_type}"
volume_size = "${var.disk_size}"
}

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.9.5 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=457b596fa06b6752f25ed320337dcbedcce7f0fb"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=db36b92abced3c4b0af279adfd5ed4bf0cf8c39f"
cluster_name = "${var.cluster_name}"
api_servers = ["${var.k8s_domain_name}"]

View File

@ -12,6 +12,16 @@ systemd:
ExecStart=/opt/installer
[Install]
WantedBy=multi-user.target
# Avoid using the standard SSH port so terraform apply cannot SSH until
# post-install. But admins may SSH to debug disk install problems.
# After install, sshd will use port 22 and users/terraform can connect.
- name: sshd.socket
dropins:
- name: 10-sshd-port.conf
contents: |
[Socket]
ListenStream=
ListenStream=2222
storage:
files:
- path: /opt/installer
@ -32,11 +42,6 @@ storage:
systemctl reboot
passwd:
users:
# Avoid using standard name "core" so terraform apply cannot SSH until post-install.
- name: debug
create:
groups:
- sudo
- docker
- name: core
ssh_authorized_keys:
- {{.ssh_authorized_key}}
- "${ssh_authorized_key}"

View File

@ -7,12 +7,13 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.2"
Environment="ETCD_IMAGE_TAG=v3.3.3"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${domain_name}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${domain_name}:2380"
Environment="ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379"
Environment="ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380"
Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
Environment="ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}"
Environment="ETCD_STRICT_RECONFIG_CHECK=true"
Environment="ETCD_SSL_DIR=/etc/ssl/etcd"
@ -117,8 +118,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.5
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/hostname
filesystem: root
mode: 0644
@ -145,7 +146,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.12.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \

View File

@ -47,8 +47,6 @@ systemd:
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
@ -81,8 +79,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.5
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/hostname
filesystem: root
mode: 0644

View File

@ -8,10 +8,6 @@ resource "matchbox_group" "container-linux-install" {
selector {
mac = "${element(concat(var.controller_macs, var.worker_macs), count.index)}"
}
metadata {
ssh_authorized_key = "${var.ssh_authorized_key}"
}
}
resource "matchbox_group" "controller" {

View File

@ -32,6 +32,7 @@ data "template_file" "container-linux-install-configs" {
ignition_endpoint = "${format("%s/ignition", var.matchbox_http_endpoint)}"
install_disk = "${var.install_disk}"
container_linux_oem = "${var.container_linux_oem}"
ssh_authorized_key = "${var.ssh_authorized_key}"
# only cached-container-linux profile adds -b baseurl
baseurl_flag = ""
@ -73,6 +74,7 @@ data "template_file" "cached-container-linux-install-configs" {
ignition_endpoint = "${format("%s/ignition", var.matchbox_http_endpoint)}"
install_disk = "${var.install_disk}"
container_linux_oem = "${var.container_linux_oem}"
ssh_authorized_key = "${var.ssh_authorized_key}"
# profile uses -b baseurl to install from matchbox cache
baseurl_flag = "-b ${var.matchbox_http_endpoint}/assets/coreos"

View File

@ -1,5 +1,5 @@
# Secure copy etcd TLS assets and kubeconfig to controllers. Activates kubelet.service
resource "null_resource" "copy-etcd-secrets" {
resource "null_resource" "copy-controller-secrets" {
count = "${length(var.controller_names)}"
connection {
@ -61,13 +61,13 @@ resource "null_resource" "copy-etcd-secrets" {
"sudo mv etcd-peer.key /etc/ssl/etcd/etcd/peer.key",
"sudo chown -R etcd:etcd /etc/ssl/etcd",
"sudo chmod -R 500 /etc/ssl/etcd",
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}
# Secure copy kubeconfig to all workers. Activates kubelet.service
resource "null_resource" "copy-kubeconfig" {
resource "null_resource" "copy-worker-secrets" {
count = "${length(var.worker_names)}"
connection {
@ -84,7 +84,7 @@ resource "null_resource" "copy-kubeconfig" {
provisioner "remote-exec" {
inline = [
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}
@ -95,13 +95,16 @@ resource "null_resource" "bootkube-start" {
# Without depends_on, this remote-exec may start before the kubeconfig copy.
# Terraform only does one task at a time, so it would try to bootstrap
# while no Kubelets are running.
depends_on = ["null_resource.copy-etcd-secrets", "null_resource.copy-kubeconfig"]
depends_on = [
"null_resource.copy-controller-secrets",
"null_resource.copy-worker-secrets",
]
connection {
type = "ssh"
host = "${element(var.controller_domains, 0)}"
user = "core"
timeout = "30m"
timeout = "15m"
}
provisioner "file" {
@ -111,7 +114,7 @@ resource "null_resource" "bootkube-start" {
provisioner "remote-exec" {
inline = [
"sudo mv /home/core/assets /opt/bootkube",
"sudo mv $HOME/assets /opt/bootkube",
"sudo systemctl start bootkube",
]
}


@ -1,3 +1,10 @@
variable "cluster_name" {
type = "string"
description = "Unique cluster name"
}
# bare-metal
variable "matchbox_http_endpoint" {
type = "string"
description = "Matchbox HTTP read-only endpoint (e.g. http://matchbox.example.com:8080)"
@ -13,17 +20,7 @@ variable "container_linux_version" {
description = "Container Linux version of the kernel/initrd to PXE or the image to install"
}
variable "cluster_name" {
type = "string"
description = "Cluster name"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key to set as an authorized_key on machines"
}
# Machines
# machines
# Terraform's crude "type system" does not properly support lists of maps so we do this.
variable "controller_names" {
@ -50,13 +47,18 @@ variable "worker_domains" {
type = "list"
}
# bootkube assets
# configuration
variable "k8s_domain_name" {
description = "Controller DNS name which resolves to a controller instance. Workers and kubeconfig's will communicate with this endpoint (e.g. cluster.example.com)"
type = "string"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for user 'core'"
}
variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
type = "string"
@ -75,14 +77,14 @@ variable "network_mtu" {
}
variable "pod_cidr" {
description = "CIDR IP range to assign Kubernetes pods"
description = "CIDR IPv4 range to assign Kubernetes pods"
type = "string"
default = "10.2.0.0/16"
}
variable "service_cidr" {
description = <<EOD
CIDR IP range to assign Kubernetes services.
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
EOD


@ -1,117 +0,0 @@
---
systemd:
units:
- name: docker.service
enable: true
- name: locksmithd.service
mask: true
- name: kubelet.path
enable: true
contents: |
[Unit]
Description=Watch for kubeconfig
[Path]
PathExists=/etc/kubernetes/kubeconfig
[Install]
WantedBy=multi-user.target
- name: wait-for-dns.service
enable: true
contents: |
[Unit]
Description=Wait for DNS entries
Wants=systemd-resolved.service
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
- name: kubelet.service
contents: |
[Unit]
Description=Kubelet via Hyperkube
Wants=rpc-statd.service
[Service]
EnvironmentFile=/etc/kubernetes/kubelet.env
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/cache/kubelet-pod.uuid \
--volume=resolv,kind=host,source=/etc/resolv.conf \
--mount volume=resolv,target=/etc/resolv.conf \
--volume var-lib-cni,kind=host,source=/var/lib/cni \
--mount volume=var-lib-cni,target=/var/lib/cni \
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=opt-cni-bin,target=/opt/cni/bin \
--volume var-log,kind=host,source=/var/log \
--mount volume=var-log,target=/var/log \
--insecure-options=image"
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
--allow-privileged \
--anonymous-auth=false \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns={{.k8s_dns_service_ip}} \
--cluster_domain={{.cluster_domain_suffix}} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--hostname-override={{.domain_name}} \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/node \
--pod-manifest-path=/etc/kubernetes/manifests \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
storage:
{{ if index . "pxe" }}
disks:
- device: /dev/sda
wipe_table: true
partitions:
- label: ROOT
filesystems:
- name: root
mount:
device: "/dev/sda1"
format: "ext4"
create:
force: true
options:
- "-LROOT"
{{end}}
files:
- path: /etc/kubernetes/kubelet.env
filesystem: root
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.5
- path: /etc/hostname
filesystem: root
mode: 0644
contents:
inline:
{{.domain_name}}
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
inline: |
fs.inotify.max_user_watches=16184
passwd:
users:
- name: core
ssh_authorized_keys:
- {{.ssh_authorized_key}}


@ -1,19 +0,0 @@
resource "matchbox_group" "workers" {
count = "${length(var.worker_names)}"
name = "${format("%s-%s", var.cluster_name, element(var.worker_names, count.index))}"
profile = "${matchbox_profile.bootkube-worker-pxe.name}"
selector {
mac = "${element(var.worker_macs, count.index)}"
}
metadata {
pxe = "true"
domain_name = "${element(var.worker_domains, count.index)}"
etcd_endpoints = "${join(",", formatlist("%s:2379", var.controller_domains))}"
k8s_dns_service_ip = "${var.kube_dns_service_ip}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
ssh_authorized_key = "${var.ssh_authorized_key}"
}
}


@ -1,20 +0,0 @@
// Container Linux Install profile (from release.core-os.net)
resource "matchbox_profile" "bootkube-worker-pxe" {
name = "bootkube-worker-pxe"
kernel = "http://${var.container_linux_channel}.release.core-os.net/amd64-usr/${var.container_linux_version}/coreos_production_pxe.vmlinuz"
initrd = [
"http://${var.container_linux_channel}.release.core-os.net/amd64-usr/${var.container_linux_version}/coreos_production_pxe_image.cpio.gz",
]
args = [
"initrd=coreos_production_pxe_image.cpio.gz",
"coreos.config.url=${var.matchbox_http_endpoint}/ignition?uuid=$${uuid}&mac=$${mac:hexhyp}",
"coreos.first_boot=yes",
"console=tty0",
"console=ttyS0",
"${var.kernel_args}",
]
container_linux_config = "${file("${path.module}/cl/bootkube-worker.yaml.tmpl")}"
}


@ -1,22 +0,0 @@
# Secure copy kubeconfig to all nodes to activate kubelet.service
resource "null_resource" "copy-kubeconfig" {
count = "${length(var.worker_names)}"
connection {
type = "ssh"
host = "${element(var.worker_domains, count.index)}"
user = "core"
timeout = "60m"
}
provisioner "file" {
content = "${var.kubeconfig}"
destination = "$HOME/kubeconfig"
}
provisioner "remote-exec" {
inline = [
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}


@ -1,72 +0,0 @@
variable "cluster_name" {
description = "Cluster name"
type = "string"
}
variable "matchbox_http_endpoint" {
type = "string"
description = "Matchbox HTTP read-only endpoint (e.g. http://matchbox.example.com:8080)"
}
variable "container_linux_channel" {
type = "string"
description = "Container Linux channel corresponding to the container_linux_version"
}
variable "container_linux_version" {
type = "string"
description = "Container Linux version of the kernel/initrd to PXE or the image to install"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key to set as an authorized key"
}
# machines
# Terraform's crude "type system" does not properly support lists of maps so we do this.
variable "controller_domains" {
type = "list"
}
variable "worker_names" {
type = "list"
}
variable "worker_macs" {
type = "list"
}
variable "worker_domains" {
type = "list"
}
# bootkube
variable "kubeconfig" {
type = "string"
}
variable "kube_dns_service_ip" {
description = "Kubernetes service IP for kube-dns (must be within server_cidr)"
type = "string"
default = "10.3.0.10"
}
# optional
variable "kernel_args" {
description = "Additional kernel arguments to provide at PXE boot."
type = "list"
default = [
"root=/dev/sda1",
]
}
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by kube-dns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = "string"
default = "cluster.local"
}


@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.9.5 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* Kubernetes v1.10.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)


@ -1,12 +1,12 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=457b596fa06b6752f25ed320337dcbedcce7f0fb"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=db36b92abced3c4b0af279adfd5ed4bf0cf8c39f"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
etcd_servers = "${digitalocean_record.etcds.*.fqdn}"
asset_dir = "${var.asset_dir}"
networking = "${var.networking}"
networking = "flannel"
network_mtu = 1440
pod_cidr = "${var.pod_cidr}"
service_cidr = "${var.service_cidr}"


@ -7,12 +7,13 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.2"
Environment="ETCD_IMAGE_TAG=v3.3.3"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
Environment="ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379"
Environment="ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380"
Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
Environment="ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}"
Environment="ETCD_STRICT_RECONFIG_CHECK=true"
Environment="ETCD_SSL_DIR=/etc/ssl/etcd"
@ -122,8 +123,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.5
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -144,7 +145,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.12.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \
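With `ETCD_LISTEN_METRICS_URLS` set as above, etcd serves metrics over plaintext HTTP on port 2381. A quick way to verify (a sketch; the controller hostname is illustrative):
```
# scrape a controller's etcd metrics endpoint
$ curl http://node1.example.com:2381/metrics | head
```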


@ -50,8 +50,6 @@ systemd:
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
@ -95,8 +93,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.5
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -114,7 +112,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://gcr.io/google_containers/hyperkube:v1.9.5 \
docker://k8s.gcr.io/hyperkube:v1.10.1 \
--net=host \
--dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)


@ -1,10 +1,10 @@
# Secure copy kubeconfig to all nodes. Activates kubelet.service
resource "null_resource" "copy-secrets" {
count = "${var.controller_count + var.worker_count}"
# Secure copy etcd TLS assets and kubeconfig to controllers. Activates kubelet.service
resource "null_resource" "copy-controller-secrets" {
count = "${var.controller_count}"
connection {
type = "ssh"
host = "${element(concat(digitalocean_droplet.controllers.*.ipv4_address, digitalocean_droplet.workers.*.ipv4_address), count.index)}"
host = "${element(concat(digitalocean_droplet.controllers.*.ipv4_address), count.index)}"
user = "core"
timeout = "15m"
}
@ -61,7 +61,30 @@ resource "null_resource" "copy-secrets" {
"sudo mv etcd-peer.key /etc/ssl/etcd/etcd/peer.key",
"sudo chown -R etcd:etcd /etc/ssl/etcd",
"sudo chmod -R 500 /etc/ssl/etcd",
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}
# Secure copy kubeconfig to all workers. Activates kubelet.service.
resource "null_resource" "copy-worker-secrets" {
count = "${var.worker_count}"
connection {
type = "ssh"
host = "${element(concat(digitalocean_droplet.workers.*.ipv4_address), count.index)}"
user = "core"
timeout = "15m"
}
provisioner "file" {
content = "${module.bootkube.kubeconfig}"
destination = "$HOME/kubeconfig"
}
provisioner "remote-exec" {
inline = [
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}
@ -69,7 +92,11 @@ resource "null_resource" "copy-secrets" {
# Secure copy bootkube assets to ONE controller and start bootkube to perform
# one-time self-hosted cluster bootstrapping.
resource "null_resource" "bootkube-start" {
depends_on = ["module.bootkube", "null_resource.copy-secrets"]
depends_on = [
"module.bootkube",
"null_resource.copy-controller-secrets",
"null_resource.copy-worker-secrets",
]
connection {
type = "ssh"
@ -85,7 +112,7 @@ resource "null_resource" "bootkube-start" {
provisioner "remote-exec" {
inline = [
"sudo mv /home/core/assets /opt/bootkube",
"sudo mv $HOME/assets /opt/bootkube",
"sudo systemctl start bootkube",
]
}


@ -1,8 +1,10 @@
variable "cluster_name" {
type = "string"
description = "Unique cluster name"
description = "Unique cluster name (prepended to dns_zone)"
}
# Digital Ocean
variable "region" {
type = "string"
description = "Digital Ocean region (e.g. nyc1, sfo2, fra1, tor1)"
@ -13,22 +15,12 @@ variable "dns_zone" {
description = "Digital Ocean domain (i.e. DNS zone) (e.g. do.example.com)"
}
variable "image" {
type = "string"
default = "coreos-stable"
description = "OS image from which to initialize the disk (e.g. coreos-stable)"
}
# instances
variable "controller_count" {
type = "string"
default = "1"
description = "Number of controllers"
}
variable "controller_type" {
type = "string"
default = "s-2vcpu-2gb"
description = "Digital Ocean droplet size (e.g. s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb)."
description = "Number of controllers (i.e. masters)"
}
variable "worker_count" {
@ -37,15 +29,22 @@ variable "worker_count" {
description = "Number of workers"
}
variable "controller_type" {
type = "string"
default = "s-2vcpu-2gb"
description = "Droplet type for controllers (e.g. s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb)."
}
variable "worker_type" {
type = "string"
default = "s-1vcpu-1gb"
description = "Digital Ocean droplet size (e.g. s-1vcpu-1gb, s-1vcpu-2gb, s-2vcpu-2gb)"
description = "Droplet type for workers (e.g. s-1vcpu-1gb, s-1vcpu-2gb, s-2vcpu-2gb)"
}
variable "ssh_fingerprints" {
type = "list"
description = "SSH public key fingerprints. (e.g. see `ssh-add -l -E md5`)"
variable "image" {
type = "string"
default = "coreos-stable"
description = "Container Linux image for instances (e.g. coreos-stable)"
}
variable "controller_clc_snippets" {
@ -60,28 +59,27 @@ variable "worker_clc_snippets" {
default = []
}
# bootkube assets
# configuration
variable "ssh_fingerprints" {
type = "list"
description = "SSH public key fingerprints. (e.g. see `ssh-add -l -E md5`)"
}
variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
type = "string"
}
variable "networking" {
description = "Choice of networking provider (flannel or calico)"
type = "string"
default = "flannel"
}
variable "pod_cidr" {
description = "CIDR IP range to assign Kubernetes pods"
description = "CIDR IPv4 range to assign Kubernetes pods"
type = "string"
default = "10.2.0.0/16"
}
variable "service_cidr" {
description = <<EOD
CIDR IP range to assign Kubernetes services.
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
EOD
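As the `ssh_fingerprints` description suggests, the MD5 fingerprints Digital Ocean expects can be listed with `ssh-add` (a sketch; the output line is illustrative and reuses the fingerprint from the tutorial):
```
# list loaded keys with MD5 fingerprints
$ ssh-add -l -E md5
2048 MD5:d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7 user@example (RSA)
```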


@ -13,7 +13,7 @@ Create a cluster following the AWS [tutorial](../aws.md#cluster). Define a worke
```tf
module "tempest-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes/workers?ref=v1.9.5"
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes/workers?ref=v1.10.1"
providers = {
aws = "aws.default"
@ -56,7 +56,7 @@ The AWS internal `workers` module supports a number of [variables](https://githu
| security_groups | Must be set to `worker_security_groups` output by cluster | "${module.cluster.worker_security_groups}" |
| name | Unique name (distinct from cluster name) | "tempest-m5s" |
| kubeconfig | Must be set to `kubeconfig` output by cluster | "${module.cluster.kubeconfig}" |
| ssh_authorized_key | SSH public key for ~/.ssh_authorized_keys | "ssh-rsa AAAAB3NZ..." |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
#### Optional
@ -77,7 +77,7 @@ Create a cluster following the Google Cloud [tutorial](../google-cloud.md#cluste
```tf
module "yavin-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.9.5"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.10.1"
providers = {
google = "google.default"
@ -111,11 +111,11 @@ Verify a managed instance group of workers joins the cluster within a few minute
```
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.9.5
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.5
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.5
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.9.5
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.9.5
yavin-controller-0.c.example-com.internal Ready 6m v1.10.1
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.1
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.1
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.10.1
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.10.1
```
### Variables
@ -131,7 +131,7 @@ The Google Cloud internal `workers` module supports a number of [variables](http
| name | Unique name (distinct from cluster name) | "yavin-16x" |
| cluster_name | Must be set to `cluster_name` of cluster | "yavin" |
| kubeconfig | Must be set to `kubeconfig` output by cluster | "${module.cluster.kubeconfig}" |
| ssh_authorized_key | SSH public key for ~/.ssh_authorized_keys | "ssh-rsa AAAAB3NZ..." |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
#### Optional
@ -139,7 +139,7 @@ The Google Cloud internal `workers` module supports a number of [variables](http
|:-----|:------------|:--------|:--------|
| count | Number of instances | 1 | 3 |
| machine_type | Compute instance machine type | "n1-standard-1" | See below |
| os_image | OS image for compute instances | "coreos-stable" | "coreos-alpha", "coreos-beta" |
| os_image | Container Linux image for compute instances | "coreos-stable" | "coreos-alpha", "coreos-beta" |
| disk_size | Size of the disk in GB | 40 | 100 |
| preemptible | If true, Compute Engine will terminate instances randomly within 24 hours | false | true |
| service_cidr | Must match `service_cidr` of cluster | "10.3.0.0/16" | "10.3.0.0/24" |


@ -1,6 +1,6 @@
# AWS
In this tutorial, we'll create a Kubernetes v1.9.5 cluster on AWS.
In this tutorial, we'll create a Kubernetes v1.10.1 cluster on AWS.
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, a VPC, gateway, subnets, auto-scaling groups of controllers and workers, network load balancers for controllers and workers, and security groups will be created.
@ -57,7 +57,7 @@ Configure the AWS provider to use your access key credentials in a `providers.tf
```tf
provider "aws" {
version = "~> 1.5.0"
version = "~> 1.11.0"
alias = "default"
region = "eu-central-1"
@ -96,7 +96,7 @@ Define a Kubernetes cluster using the module `aws/container-linux/kubernetes`.
```tf
module "aws-tempest" {
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.9.5"
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.10.1"
providers = {
aws = "aws.default"
@ -105,20 +105,19 @@ module "aws-tempest" {
template = "template.default"
tls = "tls.default"
}
cluster_name = "tempest"
# AWS
dns_zone = "aws.example.com"
dns_zone_id = "Z3PAABBCFAKEC0"
controller_count = 1
controller_type = "t2.medium"
worker_count = 2
worker_type = "t2.small"
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
cluster_name = "tempest"
dns_zone = "aws.example.com"
dns_zone_id = "Z3PAABBCFAKEC0"
# bootkube
asset_dir = "/home/user/.secrets/clusters/tempest"
# configuration
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
asset_dir = "/home/user/.secrets/clusters/tempest"
# optional
worker_count = 2
worker_type = "t2.medium"
}
```
@ -150,7 +149,7 @@ Get or update Terraform modules.
$ terraform get # downloads missing modules
$ terraform get --update # updates all modules
Get: git::https://github.com/poseidon/typhoon (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.12.0 (update)
```
Plan the resources to be created.
@ -182,9 +181,9 @@ In 4-8 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/tempest/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
ip-10-0-12-221 Ready 34m v1.9.5
ip-10-0-19-112 Ready 34m v1.9.5
ip-10-0-4-22 Ready 34m v1.9.5
ip-10-0-12-221 Ready 34m v1.10.1
ip-10-0-19-112 Ready 34m v1.10.1
ip-10-0-4-22 Ready 34m v1.10.1
```
List the pods.
@ -249,19 +248,20 @@ Reference the DNS zone id with `"${aws_route53_zone.zone-for-clusters.zone_id}"`
| Name | Description | Default | Example |
|:-----|:------------|:--------|:--------|
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
| controller_type | Controller EC2 instance type | "t2.small" | "t2.medium" |
| worker_count | Number of workers | 1 | 3 |
| worker_type | Worker EC2 instance type | "t2.small" | "t2.medium" |
| controller_type | EC2 instance type for controllers | "t2.small" | See below |
| worker_type | EC2 instance type for workers | "t2.small" | See below |
| os_channel | Container Linux AMI channel | stable | stable, beta, alpha |
| disk_size | Size of the EBS volume in GB | "40" | "100" |
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
| network_mtu | CNI interface MTU (calico only) | 1480 | 8981 |
| host_cidr | CIDR range to assign to EC2 instances | "10.0.0.0/16" | "10.1.0.0/16" |
| pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
| disk_type | Type of the EBS volume | "gp2" | standard, gp2, io1 |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | |
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
| network_mtu | CNI interface MTU (calico only) | 1480 | 8981 |
| host_cidr | CIDR IPv4 range to assign to EC2 instances | "10.0.0.0/16" | "10.1.0.0/16" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
Check the list of valid [instance types](https://aws.amazon.com/ec2/instance-types/).


@ -1,6 +1,6 @@
# Bare-Metal
In this tutorial, we'll network boot and provision a Kubernetes v1.9.5 cluster on bare-metal.
In this tutorial, we'll network boot and provision a Kubernetes v1.10.1 cluster on bare-metal.
First, we'll deploy a [Matchbox](https://github.com/coreos/matchbox) service and set up a network boot environment. Then, we'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Container Linux to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers.
@ -22,10 +22,7 @@ Collect a MAC address from each machine. For machines with multiple PXE-enabled
* 52:54:00:b2:2f:86 (node2)
* 52:54:00:c3:61:77 (node3)
Configure each machine to boot from the disk [^1] through IPMI or the BIOS menu.
[^1]: Configuring "diskless" workers that always PXE boot is possible, but not in the scope of this tutorial.
Configure each machine to boot from the disk through IPMI or the BIOS menu.
```
ipmitool -H node1 -U USER -P PASS chassis bootdev disk options=persistent
@ -177,7 +174,7 @@ Define a Kubernetes cluster using the module `bare-metal/container-linux/kuberne
```tf
module "bare-metal-mercury" {
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.9.5"
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.10.1"
providers = {
local = "local.default"
@ -186,15 +183,16 @@ module "bare-metal-mercury" {
tls = "tls.default"
}
# install
# bare-metal
cluster_name = "mercury"
matchbox_http_endpoint = "http://matchbox.example.com"
container_linux_channel = "stable"
container_linux_version = "1632.3.0"
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
# cluster
cluster_name = "mercury"
k8s_domain_name = "node1.example.com"
# configuration
k8s_domain_name = "node1.example.com"
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
asset_dir = "/home/user/.secrets/clusters/mercury"
# machines
controller_names = ["node1"]
@ -212,9 +210,6 @@ module "bare-metal-mercury" {
"node2.example.com",
"node3.example.com",
]
# output assets dir
asset_dir = "/home/user/.secrets/clusters/mercury"
}
```
@ -246,7 +241,7 @@ Get or update Terraform modules.
$ terraform get # downloads missing modules
$ terraform get --update # updates all modules
Get: git::https://github.com/poseidon/typhoon (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.12.0 (update)
```
Plan the resources to be created.
@ -297,10 +292,19 @@ module.bare-metal-mercury.null_resource.bootkube-start: Creation complete (ID: 5
Apply complete! Resources: 55 added, 0 changed, 0 destroyed.
```
To watch the install to disk (until machines reboot from disk), SSH to port 2222.
```
# before v1.10.1
$ ssh debug@node1.example.com
# after v1.10.1
$ ssh -p 2222 core@node1.example.com
```
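The same non-standard port can be used to follow installer logs, assuming journald is running in the live PXE environment (a sketch):
```
# tail the journal during the disk install phase
$ ssh -p 2222 core@node1.example.com 'journalctl -f'
```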
To watch the bootstrap process in detail, SSH to the first controller and journal the logs.
```
$ ssh node1.example.com
$ ssh core@node1.example.com
$ journalctl -f -u bootkube
bootkube[5]: Pod Status: pod-checkpointer Running
bootkube[5]: Pod Status: kube-apiserver Running
@ -318,9 +322,9 @@ bootkube[5]: Tearing down temporary bootstrap control plane...
$ export KUBECONFIG=/home/user/.secrets/clusters/mercury/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
node1.example.com Ready 11m v1.9.5
node2.example.com Ready 11m v1.9.5
node3.example.com Ready 11m v1.9.5
node1.example.com Ready 11m v1.10.1
node2.example.com Ready 11m v1.10.1
node3.example.com Ready 11m v1.10.1
```
List the pods.
@ -357,19 +361,19 @@ Learn about [maintenance](topics/maintenance.md) and [addons](addons/overview.md
| Name | Description | Example |
|:-----|:------------|:--------|
| cluster_name | Unique cluster name | mercury |
| matchbox_http_endpoint | Matchbox HTTP read-only endpoint | http://matchbox.example.com:8080 |
| container_linux_channel | Container Linux channel | stable, beta, alpha |
| container_linux_version | Container Linux version of the kernel/initrd to PXE and the image to install | 1632.3.0 |
| cluster_name | Cluster name | mercury |
| k8s_domain_name | FQDN resolving to the controller node(s). Workers and kubectl will communicate with this endpoint | "myk8s.example.com" |
| ssh_authorized_key | SSH public key for ~/.ssh/authorized_keys | "ssh-rsa AAAAB3Nz..." |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3Nz..." |
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/mercury" |
| controller_names | Ordered list of controller short names | ["node1"] |
| controller_macs | Ordered list of controller identifying MAC addresses | ["52:54:00:a1:9c:ae"] |
| controller_domains | Ordered list of controller FQDNs | ["node1.example.com"] |
| worker_names | Ordered list of worker short names | ["node2", "node3"] |
| worker_macs | Ordered list of worker identifying MAC addresses | ["52:54:00:b2:2f:86", "52:54:00:c3:61:77"] |
| worker_domains | Ordered list of worker FQDNs | ["node2.example.com", "node3.example.com"] |
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/mercury" |
### Optional
@ -380,8 +384,8 @@ Learn about [maintenance](topics/maintenance.md) and [addons](addons/overview.md
| container_linux_oem | Specify alternative OEM image ids for the disk install | "" | "vmware_raw", "xen" |
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
| network_mtu | CNI interface MTU (calico-only) | 1480 | - |
| pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
| kernel_args | Additional kernel args to provide at PXE boot | [] | "kvm-intel.nested=1" |


@ -1,6 +1,6 @@
# Digital Ocean
In this tutorial, we'll create a Kubernetes v1.9.5 cluster on Digital Ocean.
In this tutorial, we'll create a Kubernetes v1.10.1 cluster on Digital Ocean.
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, firewall rules, DNS records, tags, and droplets for Kubernetes controllers and workers will be created.
@ -90,7 +90,7 @@ Define a Kubernetes cluster using the module `digital-ocean/container-linux/kube
```tf
module "digital-ocean-nemo" {
source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes?ref=v1.9.5"
source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes?ref=v1.10.1"
providers = {
digitalocean = "digitalocean.default"
@ -100,19 +100,18 @@ module "digital-ocean-nemo" {
tls = "tls.default"
}
region = "nyc3"
dns_zone = "digital-ocean.example.com"
# Digital Ocean
cluster_name = "nemo"
region = "nyc3"
dns_zone = "digital-ocean.example.com"
cluster_name = "nemo"
image = "coreos-stable"
controller_count = 1
controller_type = "s-2vcpu-2gb"
worker_count = 2
worker_type = "s-1vcpu-1gb"
# configuration
ssh_fingerprints = ["d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7"]
# output assets dir
asset_dir = "/home/user/.secrets/clusters/nemo"
asset_dir = "/home/user/.secrets/clusters/nemo"
# optional
worker_count = 2
worker_type = "s-1vcpu-1gb"
}
```
@ -144,7 +143,7 @@ Get or update Terraform modules.
$ terraform get # downloads missing modules
$ terraform get --update # updates all modules
Get: git::https://github.com/poseidon/typhoon (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.12.0 (update)
```
Plan the resources to be created.
@ -177,9 +176,9 @@ In 3-6 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/nemo/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
10.132.110.130 Ready 10m v1.9.5
10.132.115.81 Ready 10m v1.9.5
10.132.124.107 Ready 10m v1.9.5
10.132.110.130 Ready 10m v1.10.1
10.132.115.81 Ready 10m v1.10.1
10.132.124.107 Ready 10m v1.10.1
```
List the pods.
@ -261,21 +260,17 @@ Digital Ocean requires the SSH public key be uploaded to your account, so you ma
| Name | Description | Default | Example |
|:-----|:------------|:--------|:--------|
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
| controller_type | Digital Ocean droplet size | s-2vcpu-2gb | s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb, ... |
| worker_count | Number of workers | 1 | 3 |
| worker_type | Digital Ocean droplet size | s-1vcpu-1gb | s-1vcpu-1gb, s-1vcpu-2gb, s-2vcpu-2gb, ... |
| image | OS image for droplets | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha |
| networking | Choice of networking provider | "flannel" | "flannel" |
| pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
| controller_type | Droplet type for controllers | s-2vcpu-2gb | s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb, ... |
| worker_type | Droplet type for workers | s-1vcpu-1gb | s-1vcpu-1gb, s-1vcpu-2gb, s-2vcpu-2gb, ... |
| image | Container Linux image for instances | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
Check the list of valid [droplet types](https://developers.digitalocean.com/documentation/changelog/api-v2/new-size-slugs-for-droplet-plan-changes/) or use `doctl compute size list`.
!!! warning
Do not choose a `controller_type` smaller than 2GB. Smaller droplets are not sufficient for running a controller and bootstrapping will fail.
!!! bug
Digital Ocean firewalls do not yet support the IP tunneling (IP in IP) protocol used by Calico. You can try using "calico" for `networking`, but it will only work if the cloud firewall is removed (unsafe).


@ -1,6 +1,6 @@
# Google Cloud
In this tutorial, we'll create a Kubernetes v1.9.5 cluster on Google Compute Engine (not GKE).
In this tutorial, we'll create a Kubernetes v1.10.1 cluster on Google Compute Engine (not GKE).
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, a network, firewall rules, managed instance groups of Kubernetes controllers and workers, network load balancers for controllers and workers, and health checks will be created.
@ -57,7 +57,7 @@ Configure the Google Cloud provider to use your service account key, project-id,
```tf
provider "google" {
version = "1.2"
version = "1.6"
alias = "default"
credentials = "${file("~/.config/google-cloud/terraform.json")}"
@ -97,7 +97,7 @@ Define a Kubernetes cluster using the module `google-cloud/container-linux/kuber
```tf
module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.9.5"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.10.1"
providers = {
google = "google.default"
@ -108,18 +108,17 @@ module "google-cloud-yavin" {
}
# Google Cloud
cluster_name = "yavin"
region = "us-central1"
dns_zone = "example.com"
dns_zone_name = "example-zone"
os_image = "coreos-stable"
cluster_name = "yavin"
controller_count = 1
worker_count = 2
# configuration
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
# output assets dir
asset_dir = "/home/user/.secrets/clusters/yavin"
asset_dir = "/home/user/.secrets/clusters/yavin"
# optional
worker_count = 2
}
```
@ -151,7 +150,7 @@ Get or update Terraform modules.
$ terraform get # downloads missing modules
$ terraform get --update # updates all modules
Get: git::https://github.com/poseidon/typhoon (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.12.0 (update)
```
Plan the resources to be created.
@ -185,9 +184,9 @@ In 4-8 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.9.5
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.5
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.5
yavin-controller-0.c.example-com.internal Ready 6m v1.10.1
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.1
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.1
```
List the pods.
@ -227,7 +226,7 @@ Learn about [maintenance](topics/maintenance.md) and [addons](addons/overview.md
| region | Google Cloud region | "us-central1" |
| dns_zone | Google Cloud DNS zone | "google-cloud.example.com" |
| dns_zone_name | Google Cloud DNS zone name | "example-zone" |
| ssh_authorized_key | SSH public key for ~/.ssh_authorized_keys | "ssh-rsa AAAAB3NZ..." |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/yavin" |
Check the list of valid [regions](https://cloud.google.com/compute/docs/regions-zones/regions-zones) and list Container Linux [images](https://cloud.google.com/compute/docs/images) with `gcloud compute images list | grep coreos`.
@ -255,15 +254,17 @@ resource "google_dns_managed_zone" "zone-for-clusters" {
|:-----|:------------|:--------|:--------|
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
| worker_count | Number of workers | 1 | 3 |
| machine_type | Machine type for compute instances | "n1-standard-1" | See below |
| os_image | OS image for compute instances | "coreos-stable" | "coreos-stable-1632-3-0-v20180215" |
| controller_type | Machine type for controllers | "n1-standard-1" | See below |
| worker_type | Machine type for workers | "n1-standard-1" | See below |
| os_image | Container Linux image for compute instances | "coreos-stable" | "coreos-stable-1632-3-0-v20180215" |
| disk_size | Size of the disk in GB | 40 | 100 |
| worker_preemptible | If enabled, Compute Engine will terminate workers randomly within 24 hours | false | true |
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
| pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | |
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
Check the list of valid [machine types](https://cloud.google.com/compute/docs/machine-types).


@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.9.5 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/) and [preemption](https://typhoon.psdn.io/google-cloud/#preemption) (varies by platform)
@ -44,29 +44,28 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
```tf
module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.9.5"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.10.1"
providers = {
google = "google.default"
local = "local.default"
null = "null.default"
google = "google.default"
local = "local.default"
null = "null.default"
template = "template.default"
tls = "tls.default"
tls = "tls.default"
}
# Google Cloud
cluster_name = "yavin"
region = "us-central1"
dns_zone = "example.com"
dns_zone_name = "example-zone"
os_image = "coreos-stable"
cluster_name = "yavin"
controller_count = 1
worker_count = 2
# configuration
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
# output assets dir
asset_dir = "/home/user/.secrets/clusters/yavin"
asset_dir = "/home/user/.secrets/clusters/yavin"
# optional
worker_count = 2
}
```
@ -86,9 +85,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.9.5
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.5
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.5
yavin-controller-0.c.example-com.internal Ready 6m v1.10.1
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.1
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.1
```
List the pods.


@ -162,14 +162,14 @@ show ip bgp neighbors
show ip route bgp
```
Be sure to register the peer by creating a Calico `bgpPeer` CRD with `kubectl apply`.
Be sure to register the peer by creating a Calico `BGPPeer` CRD with `kubectl apply`.
```
apiVersion: v1
kind: bgpPeer
apiVersion: crd.projectcalico.org/v1
kind: BGPPeer
metadata:
peerIP: LAN_IP
scope: global
name: NAME
spec:
peerIP: LAN_IP
asNumber: 64512
```
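Assuming the manifest above is saved as `bgppeer.yaml` (a hypothetical filename), it can be applied and then listed through the CRD group:
```
$ kubectl apply -f bgppeer.yaml
$ kubectl get bgppeers.crd.projectcalico.org
```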


@ -18,7 +18,7 @@ module "google-cloud-yavin" {
}
module "bare-metal-mercury" {
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.9.5"
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.10.1"
...
}
```


@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.9.5 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)


@ -17,7 +17,7 @@ resource "google_dns_record_set" "controllers" {
rrdatas = ["${google_compute_address.controllers-ip.address}"]
}
# Network Load Balancer (i.e. forwarding rule)
# Network Load Balancer for controllers
resource "google_compute_forwarding_rule" "controller-https-rule" {
name = "${var.cluster_name}-controller-https-rule"
ip_address = "${google_compute_address.controllers-ip.address}"


@ -1,10 +1,10 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=457b596fa06b6752f25ed320337dcbedcce7f0fb"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=db36b92abced3c4b0af279adfd5ed4bf0cf8c39f"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
etcd_servers = "${module.controllers.etcd_fqdns}"
etcd_servers = ["${null_resource.repeat.*.triggers.domain}"]
asset_dir = "${var.asset_dir}"
networking = "${var.networking}"
network_mtu = 1440


@ -7,12 +7,13 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.2"
Environment="ETCD_IMAGE_TAG=v3.3.3"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
Environment="ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379"
Environment="ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380"
Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
Environment="ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}"
Environment="ETCD_STRICT_RECONFIG_CHECK=true"
Environment="ETCD_SSL_DIR=/etc/ssl/etcd"
@ -117,8 +118,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.5
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -139,7 +140,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.12.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \


@ -1,42 +0,0 @@
module "controllers" {
source = "controllers"
cluster_name = "${var.cluster_name}"
# GCE
region = "${var.region}"
network = "${google_compute_network.network.name}"
dns_zone = "${var.dns_zone}"
dns_zone_name = "${var.dns_zone_name}"
count = "${var.controller_count}"
machine_type = "${var.machine_type}"
os_image = "${var.os_image}"
# configuration
networking = "${var.networking}"
kubeconfig = "${module.bootkube.kubeconfig}"
ssh_authorized_key = "${var.ssh_authorized_key}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
clc_snippets = "${var.controller_clc_snippets}"
}
module "workers" {
source = "workers"
name = "${var.cluster_name}"
cluster_name = "${var.cluster_name}"
# GCE
region = "${var.region}"
network = "${google_compute_network.network.name}"
count = "${var.worker_count}"
machine_type = "${var.machine_type}"
os_image = "${var.os_image}"
preemptible = "${var.worker_preemptible}"
# configuration
kubeconfig = "${module.bootkube.kubeconfig}"
ssh_authorized_key = "${var.ssh_authorized_key}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
clc_snippets = "${var.worker_clc_snippets}"
}


@ -1,6 +1,6 @@
# Discrete DNS records for each controller's private IPv4 for etcd usage
resource "google_dns_record_set" "etcds" {
count = "${var.count}"
count = "${var.controller_count}"
# DNS Zone name where record should be created
managed_zone = "${var.dns_zone_name}"
@ -21,11 +21,11 @@ data "google_compute_zones" "all" {
# Controller instances
resource "google_compute_instance" "controllers" {
count = "${var.count}"
count = "${var.controller_count}"
name = "${var.cluster_name}-controller-${count.index}"
zone = "${element(data.google_compute_zones.all.names, count.index)}"
machine_type = "${var.machine_type}"
machine_type = "${var.controller_type}"
metadata {
user-data = "${element(data.ct_config.controller_ign.*.rendered, count.index)}"
@ -41,7 +41,7 @@ resource "google_compute_instance" "controllers" {
}
network_interface {
network = "${var.network}"
network = "${google_compute_network.network.name}"
# Ephemeral external IP
access_config = {}
@ -51,9 +51,13 @@ resource "google_compute_instance" "controllers" {
tags = ["${var.cluster_name}-controller"]
}
locals {
controllers_ipv4_public = ["${google_compute_instance.controllers.*.network_interface.0.access_config.0.assigned_nat_ip}"]
}
# Controller Container Linux Config
data "template_file" "controller_config" {
count = "${var.count}"
count = "${var.controller_count}"
template = "${file("${path.module}/cl/controller.yaml.tmpl")}"
@ -65,17 +69,17 @@ data "template_file" "controller_config" {
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
etcd_initial_cluster = "${join(",", formatlist("%s=https://%s:2380", null_resource.repeat.*.triggers.name, null_resource.repeat.*.triggers.domain))}"
kubeconfig = "${indent(10, module.bootkube.kubeconfig)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
ssh_authorized_key = "${var.ssh_authorized_key}"
kubeconfig = "${indent(10, var.kubeconfig)}"
}
}
# Horrible hack to generate a Terraform list of a desired length without dependencies.
# Ideal ${repeat("etcd", 3) -> ["etcd", "etcd", "etcd"]}
resource null_resource "repeat" {
count = "${var.count}"
count = "${var.controller_count}"
triggers {
name = "etcd${count.index}"
@ -84,8 +88,8 @@ resource null_resource "repeat" {
}
data "ct_config" "controller_ign" {
count = "${var.count}"
count = "${var.controller_count}"
content = "${element(data.template_file.controller_config.*.rendered, count.index)}"
pretty_print = false
snippets = ["${var.clc_snippets}"]
snippets = ["${var.controller_clc_snippets}"]
}


@ -1,7 +0,0 @@
output "etcd_fqdns" {
value = ["${null_resource.repeat.*.triggers.domain}"]
}
output "ipv4_public" {
value = ["${google_compute_instance.controllers.*.network_interface.0.access_config.0.assigned_nat_ip}"]
}


@ -1,87 +0,0 @@
variable "cluster_name" {
type = "string"
description = "Unique cluster name"
}
variable "region" {
type = "string"
description = "Google Cloud region (e.g. us-central1, see `gcloud compute regions list`)."
}
variable "network" {
type = "string"
description = "Name of the network to attach to the compute instance interfaces"
}
variable "dns_zone" {
type = "string"
description = "Google Cloud DNS Zone value to create etcd/k8s subdomains (e.g. dghubble.io)"
}
variable "dns_zone_name" {
type = "string"
description = "Google Cloud DNS Zone name to create etcd/k8s subdomains (e.g. dghubble-io)"
}
# instances
variable "count" {
type = "string"
description = "Number of controller compute instances the instance group should manage"
}
variable "machine_type" {
type = "string"
description = "Machine type for compute instances (e.g. gcloud compute machine-types list)"
}
variable "os_image" {
type = "string"
description = "OS image from which to initialize the disk (e.g. gcloud compute images list)"
}
variable "disk_size" {
type = "string"
default = "40"
description = "The size of the disk in gigabytes."
}
# configuration
variable "networking" {
description = "Choice of networking provider (flannel or calico)"
type = "string"
default = "calico"
}
variable "kubeconfig" {
type = "string"
description = "Generated Kubelet kubeconfig"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for logging in as user 'core'"
}
variable "service_cidr" {
description = <<EOD
CIDR IP range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
EOD
type = "string"
default = "10.3.0.0/16"
}
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by kube-dns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = "string"
default = "cluster.local"
}
variable "clc_snippets" {
type = "list"
description = "Container Linux Config snippets"
default = []
}


@ -56,6 +56,20 @@ resource "google_compute_firewall" "internal-etcd" {
target_tags = ["${var.cluster_name}-controller"]
}
# Allow Prometheus to scrape etcd metrics
resource "google_compute_firewall" "internal-etcd-metrics" {
name = "${var.cluster_name}-internal-etcd-metrics"
network = "${google_compute_network.network.name}"
allow {
protocol = "tcp"
ports = [2381]
}
source_tags = ["${var.cluster_name}-worker"]
target_tags = ["${var.cluster_name}-controller"]
}
# Calico BGP and IPIP
# https://docs.projectcalico.org/v2.5/reference/public-cloud/gce
resource "google_compute_firewall" "internal-calico" {
@ -93,7 +107,7 @@ resource "google_compute_firewall" "internal-flannel" {
target_tags = ["${var.cluster_name}-controller", "${var.cluster_name}-worker"]
}
# Allow prometheus (workload) to scrape node-exporter daemonset
# Allow Prometheus to scrape node-exporter daemonset
resource "google_compute_firewall" "internal-node-exporter" {
name = "${var.cluster_name}-internal-node-exporter"
network = "${google_compute_network.network.name}"
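Once applied, the new etcd-metrics rule can be inspected by name (a sketch; assumes the `gcloud` CLI and a cluster named yavin, as in the docs):
```
# describe the firewall rule allowing Prometheus to reach port 2381
$ gcloud compute firewall-rules describe yavin-internal-etcd-metrics
```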


@ -1,19 +1,22 @@
# Deprecated
output "controllers_ipv4_public" {
value = ["${module.controllers.ipv4_public}"]
value = ["${google_compute_instance.controllers.*.network_interface.0.access_config.0.assigned_nat_ip}"]
}
output "ingress_static_ip" {
value = "${module.workers.ingress_static_ip}"
}
output "network_name" {
value = "${google_compute_network.network.name}"
}
output "network_self_link" {
value = "${google_compute_network.network.self_link}"
}
# Outputs for worker pools
output "network_name" {
value = "${google_compute_network.network.name}"
}
output "kubeconfig" {
value = "${module.bootkube.kubeconfig}"
}


@ -1,20 +1,14 @@
# Secure copy etcd TLS assets and kubeconfig to controllers. Activates kubelet.service
resource "null_resource" "copy-secrets" {
depends_on = ["module.bootkube"]
count = "${var.controller_count}"
# Secure copy etcd TLS assets to controllers.
resource "null_resource" "copy-controller-secrets" {
count = "${var.controller_count}"
connection {
type = "ssh"
host = "${element(module.controllers.ipv4_public, count.index)}"
host = "${element(local.controllers_ipv4_public, count.index)}"
user = "core"
timeout = "15m"
}
provisioner "file" {
content = "${module.bootkube.kubeconfig}"
destination = "$HOME/kubeconfig"
}
provisioner "file" {
content = "${module.bootkube.etcd_ca_cert}"
destination = "$HOME/etcd-client-ca.crt"
@@ -62,7 +56,6 @@ resource "null_resource" "copy-secrets" {
"sudo mv etcd-peer.key /etc/ssl/etcd/etcd/peer.key",
"sudo chown -R etcd:etcd /etc/ssl/etcd",
"sudo chmod -R 500 /etc/ssl/etcd",
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}
@@ -70,11 +63,16 @@ resource "null_resource" "copy-secrets" {
# Secure copy bootkube assets to ONE controller and start bootkube to perform
# one-time self-hosted cluster bootstrapping.
resource "null_resource" "bootkube-start" {
depends_on = ["module.controllers", "module.bootkube", "module.workers", "null_resource.copy-secrets"]
depends_on = [
"module.bootkube",
"module.workers",
"google_dns_record_set.controllers",
"null_resource.copy-controller-secrets",
]
connection {
type = "ssh"
host = "${element(module.controllers.ipv4_public, 0)}"
host = "${element(local.controllers_ipv4_public, 0)}"
user = "core"
timeout = "15m"
}
@@ -86,7 +84,7 @@ resource "null_resource" "bootkube-start" {
provisioner "remote-exec" {
inline = [
"sudo mv /home/core/assets /opt/bootkube",
"sudo mv $HOME/assets /opt/bootkube",
"sudo systemctl start bootkube",
]
}

View File

@@ -1,8 +1,10 @@
variable "cluster_name" {
type = "string"
description = "Cluster name"
description = "Unique cluster name (prepended to dns_zone)"
}
# Google Cloud
variable "region" {
type = "string"
description = "Google Cloud Region (e.g. us-central1, see `gcloud compute regions list`)"
@@ -10,35 +12,20 @@ variable "region" {
variable "dns_zone" {
type = "string"
description = "Google Cloud DNS Zone (e.g. google-cloud.dghubble.io)"
description = "Google Cloud DNS Zone (e.g. google-cloud.example.com)"
}
variable "dns_zone_name" {
type = "string"
description = "Google Cloud DNS Zone name (e.g. google-cloud-prod-zone)"
description = "Google Cloud DNS Zone name (e.g. example-zone)"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for user 'core'"
}
variable "machine_type" {
type = "string"
default = "n1-standard-1"
description = "Machine type for compute instances (see `gcloud compute machine-types list`)"
}
variable "os_image" {
type = "string"
default = "coreos-stable"
description = "OS image from which to initialize the disk (see `gcloud compute images list`)"
}
# instances
variable "controller_count" {
type = "string"
default = "1"
description = "Number of controllers"
description = "Number of controllers (i.e. masters)"
}
variable "worker_count" {
@@ -47,6 +34,30 @@ variable "worker_count" {
description = "Number of workers"
}
variable "controller_type" {
type = "string"
default = "n1-standard-1"
description = "Machine type for controllers (see `gcloud compute machine-types list`)"
}
variable "worker_type" {
type = "string"
default = "n1-standard-1"
description = "Machine type for controllers (see `gcloud compute machine-types list`)"
}
variable "os_image" {
type = "string"
default = "coreos-stable"
description = "Container Linux image for compute instances (e.g. coreos-stable)"
}
variable "disk_size" {
type = "string"
default = "40"
description = "Size of the disk in GB"
}
variable "worker_preemptible" {
type = "string"
default = "false"
@@ -65,7 +76,12 @@ variable "worker_clc_snippets" {
default = []
}
# bootkube assets
# configuration
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for user 'core'"
}
variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
@@ -79,14 +95,14 @@ variable "networking" {
}
variable "pod_cidr" {
description = "CIDR IP range to assign Kubernetes pods"
description = "CIDR IPv4 range to assign Kubernetes pods"
type = "string"
default = "10.2.0.0/16"
}
variable "service_cidr" {
description = <<EOD
CIDR IP range to assign Kubernetes services.
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver; the 10th IP will be reserved for kube-dns.
EOD
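Putting the variables together, a cluster definition might look like the following sketch; the module ref, DNS values, key, and paths are placeholders, and defaults are written out for illustration:

module "google-cloud-yavin" {
  source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.10.1"

  # Google Cloud
  cluster_name  = "yavin"
  region        = "us-central1"
  dns_zone      = "google-cloud.example.com"
  dns_zone_name = "example-zone"

  # instances
  controller_count = 1
  controller_type  = "n1-standard-1"
  worker_count     = 2
  worker_type      = "n1-standard-1"
  os_image         = "coreos-stable"
  disk_size        = "40"

  # configuration
  ssh_authorized_key = "ssh-rsa AAAA..."
  asset_dir          = "/home/user/.secrets/clusters/yavin"
  networking         = "calico"
}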

View File

@@ -0,0 +1,21 @@
module "workers" {
source = "workers"
name = "${var.cluster_name}"
cluster_name = "${var.cluster_name}"
# GCE
region = "${var.region}"
network = "${google_compute_network.network.name}"
count = "${var.worker_count}"
machine_type = "${var.worker_type}"
os_image = "${var.os_image}"
disk_size = "${var.disk_size}"
preemptible = "${var.worker_preemptible}"
# configuration
kubeconfig = "${module.bootkube.kubeconfig}"
ssh_authorized_key = "${var.ssh_authorized_key}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
clc_snippets = "${var.worker_clc_snippets}"
}
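Note the design choice here: kubeconfig now flows from module.bootkube into the workers module as an input, presumably rendered into each worker's Container Linux Config, so the SSH provisioners above only need to copy etcd TLS material to controllers rather than distributing kubeconfigs node by node.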

View File

@@ -40,8 +40,6 @@ systemd:
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
@@ -90,8 +88,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.5
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@@ -109,7 +107,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://gcr.io/google_containers/hyperkube:v1.9.5 \
docker://k8s.gcr.io/hyperkube:v1.10.1 \
--net=host \
--dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)

View File

@@ -1,21 +1,23 @@
variable "name" {
type = "string"
description = "Unique name for instance group"
description = "Unique name for the worker pool"
}
variable "cluster_name" {
type = "string"
description = "Cluster name"
description = "Must be set to `cluster_name of cluster`"
}
# Google Cloud
variable "region" {
type = "string"
description = "Google Cloud region (e.g. us-central1, see `gcloud compute regions list`)."
description = "Must be set to `region` of cluster"
}
variable "network" {
type = "string"
description = "Name of the network to attach to the compute instance interfaces"
description = "Must be set to `network_name` output by cluster"
}
# instances
@@ -35,13 +37,13 @@ variable "machine_type" {
variable "os_image" {
type = "string"
default = "coreos-stable"
description = "OS image from which to initialize the disk (e.g. gcloud compute images list)"
description = "Container Linux image for compute instanges (e.g. gcloud compute images list)"
}
variable "disk_size" {
type = "string"
default = "40"
description = "The size of the disk in gigabytes."
description = "Size of the disk in GB"
}
variable "preemptible" {
@@ -54,7 +56,7 @@ variable "preemptible" {
variable "kubeconfig" {
type = "string"
description = "Generated Kubelet kubeconfig"
description = "Must be set to `kubeconfig` output by cluster"
}
variable "ssh_authorized_key" {
@@ -64,7 +66,7 @@ variable "ssh_authorized_key" {
variable "service_cidr" {
description = <<EOD
CIDR IP range to assign Kubernetes services.
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver; the 10th IP will be reserved for kube-dns.
EOD

View File

@@ -1,5 +1,5 @@
mkdocs==0.17.2
mkdocs-material==2.2.6
mkdocs==0.17.3
mkdocs-material==2.7.1
pygments==2.2.0
pymdown-extensions==3.5
six==1.10.0