Compare commits

..

41 Commits

Author SHA1 Message Date
77c0a4cf2e Update Kubernetes from v1.10.0 to v1.10.1
* Use kubernetes-incubator/bootkube v0.12.0
2018-04-12 20:57:31 -07:00
5035d56db2 Refactor GCP to remove controller internal module
* Remove the controller internal module to align with
other platforms and since its not a supported use case
2018-04-12 19:41:51 -07:00
9bb3de5327 Skip creating unused dirs on worker nodes 2018-04-11 22:23:51 -07:00
c8eabc2af4 Fix GCP controller_type and worker_type vars 2018-04-11 22:19:58 -07:00
2eaf858c5c Update example BGPPeer manifest
Previous example may have been outdated. It resulted in `error: unable to recognize "example.yaml": no matches for /, Kind=bgpPeer` .

See https://docs.projectcalico.org/v3.0/reference/calicoctl/resources/bgppeer.
2018-04-09 23:23:18 -05:00
b8656fd74b Clarify bare-metal SSH instructions 2018-04-08 14:11:05 -07:00
d276fffcda Fix bare-metal multiple apply/ssh on Terraform v0.11.4+
* Terraform v0.11.4 introduced changes to remote-exec
that mean Typhoon bare-metal clusters require multiple
runs of terraform apply to ssh and bootstrap.
* Bare-metal installs PXE boot a live instance to install
to disk and then reboot from disk as controllers/workers.
Terraform remote-exec has no way to "know" to wait until
the reboot has occurred to kickoff Kubernetes bootstrap.
Previously Typhoon created a "debug" user during this
install phase to allow an admin to SSH, but remote-exec
would hang, trying to connect as user "core". Terraform
v0.11.4 changes this behavior so remote-exec fails and
a user must re-run terraform apply until succeeding.
* A new way to "trick" remote-exec into waiting for the
reboot into the disk install is to run SSH on a non-standard
port during the disk install. This retains the ability
for an admin to SSH during install (most distros don't have
this) and fixes the issue so only a single run of terraform
apply is needed.
* https://github.com/hashicorp/terraform/pull/17359#issuecomment-376415464
2018-04-08 13:32:31 -07:00
6b08bde479 Use k8s.gcr.io instead of gcr.io/google_containers
* Kubernetes recommends using the alias to fetch images
from the nearest GCR regional mirror, to abstract the use
of GCR, and to drop names containing 'google'
* https://groups.google.com/forum/#!msg/kubernetes-dev/ytjk_rNrTa0/3EFUHvovCAAJ
2018-04-08 12:57:52 -07:00
f4b2396718 Return Prometheus deployment to be a worker workload
* Expose etcd metrics to workers so Prometheus can
run on a worker, rather than a controller
* Drop temporary firewall rules allowing Prometheus
to run on a controller and scrape targes
* Related to https://github.com/poseidon/typhoon/pull/175
2018-04-08 12:20:00 -07:00
b76126db93 Update docs builder and material theme 2018-04-08 00:00:03 -07:00
7186aa46da Update kube-state-metrics from v1.2.0 to v1.3.0
* https://github.com/kubernetes/kube-state-metrics/pull/412
* https://github.com/kubernetes/kube-state-metrics/pull/413
2018-04-04 21:04:13 -07:00
18dbaf74ce Update kube-dns from v1.14.8 to v1.14.9
* https://github.com/kubernetes/kubernetes/pull/61908
2018-04-04 21:00:23 -07:00
ce001e9d56 Update etcd from v3.3.2 to v3.3.3
* https://github.com/coreos/etcd/releases/tag/v3.3.3
2018-04-04 20:32:24 -07:00
d770393dbc Add etcd metrics, Prometheus scrapes, and Grafana dash
* Use etcd v3.3 --listen-metrics-urls to expose only metrics
data via http://0.0.0.0:2381 on controllers
* Add Prometheus discovery for etcd peers on controller nodes
* Temporarily drop two noisy Prometheus alerts
2018-04-03 20:31:00 -07:00
642f7ec22f Update CHANGES.md with Kubernetes link 2018-03-30 23:12:38 -07:00
1cc043d1eb Update Kubernetes from v1.9.6 to v1.10.0 2018-03-30 22:14:07 -07:00
f8e9bfb1c0 Add disk_type variable for EBS volume type on AWS
* Change EBS volume type from `standard` ("prior generation)
 to `gp2`. Prometheus alerts are tuned for SSDs
* Other platforms have fast enough disks by default
2018-03-29 22:51:54 -07:00
b1e41dcb99 addons: Update from Grafana v4.6.3 to v5.0.4
This reverts commit c59a9c66b1.
2018-03-28 19:45:19 -07:00
de4d90750e Use consistent naming of remote provision steps 2018-03-26 00:29:57 -07:00
7acd4931f6 Remove redundant kubeconfig copy on AWS and GCP
* AWS and Google Cloud make use of auto-scaling groups
and managed instance groups, respectively. As such, the
kubeconfig is already held in cloud user-data
* Controller instances are provisioned with a kubeconfig
from user-data. Its redundant to use a Terraform remote
file copy step for the kubeconfig.
2018-03-26 00:01:47 -07:00
cfd603bea2 Ensure etcd secrets are only distributed to controller hosts
* Previously, etcd secrets were erroneously distributed to worker
nodes (permissions 500, ownership etc:etcd).
2018-03-25 23:46:44 -07:00
fdb543e834 Add optional controller_type and worker_type vars on GCP
* Remove optional machine_type variable on Google Cloud
* Use controller_type and worker_type instead
2018-03-25 22:11:18 -07:00
8d3d4220fd Add disk_size variable on Google Cloud 2018-03-25 22:04:14 -07:00
ba9daf439e Remove unmaintained pxe-worker internal module 2018-03-25 21:57:52 -07:00
38adb14bd2 Remove optional variable networking on Digital Ocean
* Calico isn't viable on Digital Ocean because their firewalls
do not support IP-IP protocol. Its not viable to run a cluster
without firewalls just to use Calico.
* Remove the caveat note. Don't allow users to shoot themselves
in the foot
2018-03-25 21:48:51 -07:00
e43cf9f608 Organize and cleanup variable descriptions 2018-03-25 21:44:43 -07:00
455a4af27e Improve cluster definition examples in docs 2018-03-25 20:41:52 -07:00
39876e455f Fix docs to reflect enforced provider versions 2018-03-25 11:34:39 -07:00
da2be86e8c Add v1.9.6 heading to CHANGES.md 2018-03-22 22:01:29 -07:00
65a2751f77 addons: Update heapster from v1.5.1 to v1.5.2
* https://github.com/kubernetes/heapster/releases/tag/v1.5.2
2018-03-21 20:32:01 -07:00
a04ef3919a Update Kubernetes from v1.9.5 to v1.9.6 2018-03-21 20:29:52 -07:00
851bc1a3f8 Update nginx-ingress from 0.11.0 to 0.12.0 2018-03-19 23:17:17 -07:00
758c09fa5c Update Kubernetes from v1.9.4 to v1.9.5 2018-03-19 00:25:44 -07:00
b1cdd361ef Mention controllers node label in changelog 2018-03-19 00:15:56 -07:00
7f7bc960a6 Set default Google Cloud os_image to coreos-stable 2018-03-19 00:08:26 -07:00
29108fd99d Improve changelog with migration links 2018-03-18 23:54:55 -07:00
18d08de898 Add Container Linux Config snippet docs 2018-03-18 23:22:40 -07:00
f3730b2bfa Add Container Linux Config snippets feature
* Introduce the ability to support Container Linux Config
"snippets" for controllers and workers on cloud platforms.
This allows end-users to customize hosts by providing Container
Linux configs that are additively merged into the base configs
defined by Typhoon. Config snippets are validated, merged, and
show any errors during `terraform plan`
* Example uses include adding systemd units, network configs,
mounts, files, raid arrays, or other disk provisioning features
provided by Container Linux Configs (using Ignition low-level)
* Requires terraform-provider-ct v0.2.1 plugin
2018-03-18 18:28:18 -07:00
88aa9a46e5 Add /var/lib/calico volume mount to Calico DaemonSet 2018-03-18 16:40:38 -07:00
efa90d8b44 Add a new key=value label to controller nodes
* Add a node-role.kubernetes.io/controller="true" node label
to controllers so Prometheus service discovery can filter to
services that only run on controllers (i.e. masters)
* Leave node-role.kubernetes.io/master="" untouched as its
a Kubernetes convention
2018-03-18 16:39:10 -07:00
46226a8015 Update Prometheus from 2.2.0 to 2.2.1 2018-03-18 15:56:44 -07:00
77 changed files with 936 additions and 1034 deletions

View File

@ -4,6 +4,95 @@ Notable changes between versions.
## Latest
* Kubernetes [v1.10.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1101)
* Enable etcd v3.3 metrics endpoint ([#175](https://github.com/poseidon/typhoon/pull/175))
* Use `k8s.gcr.io` instead of `gcr.io/google_containers` ([#180](https://github.com/poseidon/typhoon/pull/180))
* Kubernetes [recommends](https://groups.google.com/forum/#!msg/kubernetes-dev/ytjk_rNrTa0/3EFUHvovCAAJ) using the alias to pull from the nearest regional mirror and to abstract the backing container registry
* Update kube-dns from v1.14.8 to v1.14.9
* Update etcd from v3.3.2 to v3.3.3
* Use kubernetes-incubator/bootkube v0.12.0
#### Bare-Metal
* Fix need for multiple `terraform apply` runs to create a cluster with Terraform v0.11.4 ([#181](https://github.com/poseidon/typhoon/pull/181))
* To SSH during a disk install for debugging, SSH as user "core" with port 2222
* Remove the old trick of using a user "debug" during disk install
#### Google Cloud
* Refactor out the `controller` internal module
#### Addons
* Add Prometheus discovery for etcd peers on controller nodes ([#175](https://github.com/poseidon/typhoon/pull/175))
* Scrape etcd v3.3 `--listen-metrics-urls` for metrics
* Enable etcd alerts and populate the etcd Grafana dashboard
* Update kube-state-metrics from v1.2.0 to v1.3.0
## v1.10.0
* Kubernetes [v1.10.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1100)
* Remove unused, unmaintained `pxe-worker` internal module
#### AWS
* Add `disk_type` optional variable for setting the EBS volume type ([#176](https://github.com/poseidon/typhoon/pull/176))
* Change default type from `standard` to `gp2`. Prometheus etcd alerts are tuned for fast disks.
#### Digital Ocean
* Ensure etcd secrets are only distributed to controller hosts, not workers.
* Remove `networking` optional variable. Only flannel works on Digital Ocean.
#### Google Cloud
* Add `disk_size` optional variable for setting instance disk size in GB
* Add `controller_type` optional variable for setting machine type for controllers
* Add `worker_type` optional variable for setting machine type for workers
* Remove `machine_type` optional variable. Use `controller_type` and `worker_type`.
#### Addons
* Update Grafana from v4.6.3 to v5.0.4 ([#153](https://github.com/poseidon/typhoon/pull/153), [#174](https://github.com/poseidon/typhoon/pull/174))
* Restrict dashboard organization role to Viewer
## v1.9.6
* Kubernetes [v1.9.6](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v196)
* Update Calico from v3.0.3 to v3.0.4
#### Addons
* Update heapster from v1.5.1 to v1.5.2
## v1.9.5
* Kubernetes [v1.9.5](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v195)
* Fix `subPath` volume mounts regression ([kubernetes#61076](https://github.com/kubernetes/kubernetes/issues/61076))
* Introduce [Container Linux Config snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) on cloud platforms ([#145](https://github.com/poseidon/typhoon/pull/145))
* Validate and additively merge custom Container Linux Configs during `terraform plan`
* Define files, systemd units, dropins, networkd configs, mounts, users, and more
* Require updating `terraform-provider-ct` plugin from v0.2.0 to v0.2.1
* Add `node-role.kubernetes.io/controller="true"` node label to controllers ([#160](https://github.com/poseidon/typhoon/pull/160))
#### AWS
* [Require](https://typhoon.psdn.io/topics/maintenance/#terraform-provider-ct-v021) updating `terraform-provider-ct` plugin from v0.2.0 to [v0.2.1](https://github.com/coreos/terraform-provider-ct/releases/tag/v0.2.1) (action required!)
#### Digital Ocean
* [Require](https://typhoon.psdn.io/topics/maintenance/#terraform-provider-ct-v021) updating `terraform-provider-ct` plugin from v0.2.0 to [v0.2.1](https://github.com/coreos/terraform-provider-ct/releases/tag/v0.2.1) (action required!)
#### Google Cloud
* [Require](https://typhoon.psdn.io/topics/maintenance/#terraform-provider-ct-v021) updating `terraform-provider-ct` plugin from v0.2.0 to [v0.2.1](https://github.com/coreos/terraform-provider-ct/releases/tag/v0.2.1) (action required!)
* Relax `os_image` to optional. Default to "coreos-stable".
#### Addons
* Update nginx-ingress from 0.11.0 to 0.12.0
* Update Prometheus from 2.2.0 to 2.2.1
## v1.9.4
* Kubernetes [v1.9.4](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.9.md#v194)
@ -15,7 +104,7 @@ Notable changes between versions.
* Allow flexvolume plugins to be used on any Typhoon cluster (not just bare-metal)
* Upgrade etcd from v3.2.15 to v3.3.2
* Update Calico from v3.0.2 to v3.0.3
* Use kubernetes-incubator/bootkube v0.10.0
* Use kubernetes-incubator/bootkube v0.11.0
* [Recommend](https://typhoon.psdn.io/topics/maintenance/#terraform-provider-ct-v021) updating `terraform-provider-ct` plugin from v0.2.0 to [v0.2.1](https://github.com/coreos/terraform-provider-ct/releases/tag/v0.2.1) (action recommended)
#### AWS

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.9.4 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/) and [preemption](https://typhoon.psdn.io/google-cloud/#preemption) (varies by platform)
@ -44,29 +44,28 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
```tf
module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.9.4"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.10.1"
providers = {
google = "google.default"
local = "local.default"
null = "null.default"
google = "google.default"
local = "local.default"
null = "null.default"
template = "template.default"
tls = "tls.default"
tls = "tls.default"
}
# Google Cloud
cluster_name = "yavin"
region = "us-central1"
dns_zone = "example.com"
dns_zone_name = "example-zone"
os_image = "coreos-stable"
cluster_name = "yavin"
controller_count = 1
worker_count = 2
# configuration
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
# output assets dir
asset_dir = "/home/user/.secrets/clusters/yavin"
asset_dir = "/home/user/.secrets/clusters/yavin"
# optional
worker_count = 2
}
```
@ -87,9 +86,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.9.4
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.4
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.4
yavin-controller-0.c.example-com.internal Ready 6m v1.10.1
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.1
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.1
```
List the pods.

View File

@ -0,0 +1,15 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: grafana-dashboard-providers
namespace: monitoring
data:
dashboard-providers.yaml: |+
apiVersion: 1
providers:
- name: 'default'
ordId: 1
folder: ''
type: file
options:
path: /var/lib/grafana/dashboards

View File

@ -5,14 +5,12 @@ metadata:
namespace: monitoring
data:
deployment-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -39,7 +37,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -110,7 +108,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -181,7 +179,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "Bps",
"gauge": {
@ -262,7 +260,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -333,7 +331,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -403,7 +401,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -473,7 +471,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -550,7 +548,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -665,7 +663,7 @@ data:
{
"allValue": ".*",
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": false,
"label": "Namespace",
@ -685,7 +683,7 @@ data:
{
"allValue": null,
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": false,
"label": "Deployment",
@ -737,24 +735,11 @@ data:
"title": "Deployment",
"version": 1
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
etcd-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"name": "DS_PROMETHEUS",
"name": "prometheus",
"label": "prometheus",
"description": "",
"type": "datasource",
@ -813,7 +798,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"format": "none",
@ -889,7 +874,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 0,
@ -978,7 +963,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 0,
@ -1079,7 +1064,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"decimals": null,
"editable": false,
"error": false,
@ -1161,7 +1146,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 0,
@ -1250,7 +1235,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 0,
@ -1342,7 +1327,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 5,
@ -1422,7 +1407,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 5,
@ -1502,7 +1487,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 0,
@ -1582,7 +1567,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"decimals": null,
"editable": false,
"error": false,
@ -1676,7 +1661,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 0,
@ -1782,7 +1767,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"decimals": 0,
"editable": false,
"error": false,
@ -1909,26 +1894,13 @@ data:
"title": "etcd",
"version": 4
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
kubernetes-capacity-planning-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -1954,7 +1926,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2032,7 +2004,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2134,7 +2106,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2250,7 +2222,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -2333,7 +2305,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2440,7 +2412,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percentunit",
"gauge": {
@ -2522,7 +2494,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2604,7 +2576,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2695,7 +2667,7 @@ data:
"aliasColors": {},
"bars": false,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -2782,7 +2754,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -2897,26 +2869,13 @@ data:
"title": "Kubernetes Capacity Planning",
"version": 4
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
kubernetes-cluster-health-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -2944,7 +2903,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3025,7 +2984,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3101,7 +3060,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3177,7 +3136,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3263,7 +3222,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3339,7 +3298,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3415,7 +3374,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3491,7 +3450,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3605,26 +3564,13 @@ data:
"title": "Kubernetes Cluster Health",
"version": 9
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
kubernetes-cluster-status-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -3651,7 +3597,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3723,7 +3669,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -3805,7 +3751,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -3877,7 +3823,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -3949,7 +3895,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4021,7 +3967,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -4103,7 +4049,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4175,7 +4121,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4247,7 +4193,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4319,7 +4265,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4429,26 +4375,13 @@ data:
"title": "Kubernetes Cluster Status",
"version": 3
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
kubernetes-control-plane-status-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -4475,7 +4408,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4550,7 +4483,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4625,7 +4558,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4700,7 +4633,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -4783,7 +4716,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -4869,7 +4802,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -4944,7 +4877,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -5069,26 +5002,13 @@ data:
"title": "Kubernetes Control Plane Status",
"version": 3
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
kubernetes-resource-requests-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -5113,7 +5033,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"description": "This represents the total [CPU resource requests](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-cpu) in the cluster.\nFor comparison the total [allocatable CPU cores](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node-allocatable.md) is also shown.",
"editable": false,
"error": false,
@ -5202,7 +5122,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -5284,7 +5204,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"description": "This represents the total [memory resource requests](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-memory) in the cluster.\nFor comparison the total [allocatable memory](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node-allocatable.md) is also shown.",
"editable": false,
"error": false,
@ -5373,7 +5293,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -5486,26 +5406,13 @@ data:
"title": "Kubernetes Resource Requests",
"version": 2
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
nodes-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -5532,7 +5439,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -5611,7 +5518,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -5713,7 +5620,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -5825,7 +5732,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percent",
"gauge": {
@ -5907,7 +5814,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -6014,7 +5921,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(245, 54, 54, 0.9)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "percentunit",
"gauge": {
@ -6096,7 +6003,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -6178,7 +6085,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -6270,7 +6177,7 @@ data:
{
"allValue": null,
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": false,
"label": null,
@ -6322,26 +6229,13 @@ data:
"title": "Nodes",
"version": 2
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
pods-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -6366,7 +6260,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -6472,7 +6366,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -6576,7 +6470,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -6662,7 +6556,7 @@ data:
{
"allValue": ".*",
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": true,
"label": "Namespace",
@ -6682,7 +6576,7 @@ data:
{
"allValue": null,
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": false,
"label": "Pod",
@ -6702,7 +6596,7 @@ data:
{
"allValue": ".*",
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": true,
"label": "Container",
@ -6754,26 +6648,13 @@ data:
"title": "Pods",
"version": 1
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
statefulset-dashboard.json: |+
{
"dashboard":
{
"__inputs": [
{
"description": "",
"label": "prometheus",
"name": "DS_PROMETHEUS",
"name": "prometheus",
"pluginId": "prometheus",
"pluginName": "Prometheus",
"type": "datasource"
@ -6800,7 +6681,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -6871,7 +6752,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -6942,7 +6823,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "Bps",
"gauge": {
@ -7023,7 +6904,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -7094,7 +6975,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -7164,7 +7045,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -7234,7 +7115,7 @@ data:
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"format": "none",
"gauge": {
@ -7311,7 +7192,7 @@ data:
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"editable": false,
"error": false,
"fill": 1,
@ -7405,7 +7286,7 @@ data:
{
"allValue": ".*",
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": false,
"label": "Namespace",
@ -7425,7 +7306,7 @@ data:
{
"allValue": null,
"current": {},
"datasource": "${DS_PROMETHEUS}",
"datasource": "prometheus",
"hide": 0,
"includeAll": false,
"label": "StatefulSet",
@ -7477,23 +7358,4 @@ data:
"title": "StatefulSet",
"version": 1
}
,
"inputs": [
{
"name": "DS_PROMETHEUS",
"pluginId": "prometheus",
"type": "datasource",
"value": "prometheus"
}
],
"overwrite": true
}
prometheus-datasource.json: |+
{
"access": "proxy",
"basicAuth": false,
"name": "prometheus",
"type": "prometheus",
"url": "http://prometheus.monitoring.svc"
}
---

View File

@ -0,0 +1,16 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: grafana-datasources
namespace: monitoring
data:
prometheus.yaml: |+
apiVersion: 1
datasources:
- name: prometheus
type: prometheus
access: proxy
orgId: 1
url: http://prometheus.monitoring.svc.cluster.local
version: 1
editable: false

View File

@ -21,7 +21,7 @@ spec:
spec:
containers:
- name: grafana
image: grafana/grafana:4.6.3
image: grafana/grafana:5.0.4
env:
- name: GF_SERVER_HTTP_PORT
value: "8080"
@ -30,7 +30,7 @@ spec:
- name: GF_AUTH_ANONYMOUS_ENABLED
value: "true"
- name: GF_AUTH_ANONYMOUS_ORG_ROLE
value: Admin
value: Viewer
ports:
- name: http
containerPort: 8080
@ -41,22 +41,20 @@ spec:
limits:
memory: 200Mi
cpu: 200m
- name: grafana-watcher
image: quay.io/coreos/grafana-watcher:v0.0.8
args:
- '--watch-dir=/etc/grafana/dashboards'
- '--grafana-url=http://localhost:8080'
resources:
requests:
memory: "16Mi"
cpu: "50m"
limits:
memory: "32Mi"
cpu: "100m"
volumeMounts:
- name: dashboards
mountPath: /etc/grafana/dashboards
- name: datasources
mountPath: /etc/grafana/provisioning/datasources
- name: dashboard-providers
mountPath: /etc/grafana/provisioning/dashboards
- name: dashboards
mountPath: /var/lib/grafana/dashboards
volumes:
- name: datasources
configMap:
name: grafana-datasources
- name: dashboard-providers
configMap:
name: grafana-dashboard-providers
- name: dashboards
configMap:
name: grafana-dashboards

View File

@ -18,7 +18,7 @@ spec:
serviceAccountName: heapster
containers:
- name: heapster
image: k8s.gcr.io/heapster-amd64:v1.5.1
image: k8s.gcr.io/heapster-amd64:v1.5.2
command:
- /heapster
- --source=kubernetes.summary_api:''

View File

@ -20,7 +20,7 @@ spec:
# Any image is permissable as long as:
# 1. It serves a 404 page at /
# 2. It serves 200 on a /healthz endpoint
image: gcr.io/google_containers/defaultbackend:1.4
image: k8s.gcr.io/defaultbackend:1.4
ports:
- containerPort: 8080
resources:

View File

@ -23,7 +23,7 @@ spec:
hostNetwork: true
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.11.0
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.12.0
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-backend

View File

@ -23,7 +23,7 @@ spec:
hostNetwork: true
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.11.0
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.12.0
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-backend

View File

@ -20,7 +20,7 @@ spec:
# Any image is permissable as long as:
# 1. It serves a 404 page at /
# 2. It serves 200 on a /healthz endpoint
image: gcr.io/google_containers/defaultbackend:1.4
image: k8s.gcr.io/defaultbackend:1.4
ports:
- containerPort: 8080
resources:

View File

@ -20,7 +20,7 @@ spec:
# Any image is permissable as long as:
# 1. It serves a 404 page at /
# 2. It serves 200 on a /healthz endpoint
image: gcr.io/google_containers/defaultbackend:1.4
image: k8s.gcr.io/defaultbackend:1.4
ports:
- containerPort: 8080
resources:

View File

@ -23,7 +23,7 @@ spec:
hostNetwork: true
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.11.0
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.12.0
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-backend

View File

@ -112,6 +112,22 @@ data:
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
# Scrap etcd metrics from controllers
- job_name: 'etcd'
kubernetes_sd_configs:
- role: node
scheme: http
relabel_configs:
- source_labels: [__meta_kubernetes_node_label_node_role_kubernetes_io_controller]
action: keep
regex: 'true'
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- source_labels: [__meta_kubernetes_node_name]
action: replace
target_label: __address__
replacement: '${1}:2381'
# Scrape config for service endpoints.
#
# The relabeling allows the actual service scrape endpoint to be configured

View File

@ -18,7 +18,7 @@ spec:
serviceAccountName: prometheus
containers:
- name: prometheus
image: quay.io/prometheus/prometheus:v2.2.0
image: quay.io/prometheus/prometheus:v2.2.1
args:
- '--config.file=/etc/prometheus/prometheus.yaml'
ports:

View File

@ -5,6 +5,8 @@ metadata:
rules:
- apiGroups: [""]
resources:
- configmaps
- secrets
- nodes
- pods
- services

View File

@ -22,7 +22,7 @@ spec:
serviceAccountName: kube-state-metrics
containers:
- name: kube-state-metrics
image: quay.io/coreos/kube-state-metrics:v1.2.0
image: quay.io/coreos/kube-state-metrics:v1.3.0
ports:
- name: metrics
containerPort: 8080
@ -33,7 +33,7 @@ spec:
initialDelaySeconds: 5
timeoutSeconds: 5
- name: addon-resizer
image: gcr.io/google_containers/addon-resizer:1.7
image: k8s.gcr.io/addon-resizer:1.7
resources:
limits:
cpu: 100m

View File

@ -63,26 +63,6 @@ data:
description: etcd instance {{ $labels.instance }} has seen {{ $value }} leader
changes within the last hour
summary: a high number of leader changes within the etcd cluster are happening
- alert: HighNumberOfFailedGRPCRequests
expr: sum(rate(grpc_server_handled_total{grpc_code!="OK",job="etcd"}[5m])) BY (grpc_service, grpc_method)
/ sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method) > 0.01
for: 10m
labels:
severity: warning
annotations:
description: '{{ $value }}% of requests for {{ $labels.grpc_method }} failed
on etcd instance {{ $labels.instance }}'
summary: a high number of gRPC requests are failing
- alert: HighNumberOfFailedGRPCRequests
expr: sum(rate(grpc_server_handled_total{grpc_code!="OK",job="etcd"}[5m])) BY (grpc_service, grpc_method)
/ sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method) > 0.05
for: 5m
labels:
severity: critical
annotations:
description: '{{ $value }}% of requests for {{ $labels.grpc_method }} failed
on etcd instance {{ $labels.instance }}'
summary: a high number of gRPC requests are failing
- alert: GRPCRequestsSlow
expr: histogram_quantile(0.99, sum(rate(grpc_server_handling_seconds_bucket{job="etcd",grpc_type="unary"}[5m])) by (grpc_service, grpc_method, le))
> 0.15

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.9.4 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/)

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=c5fc93d95fe4993511656cdd6372afbd1307f08f"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=db36b92abced3c4b0af279adfd5ed4bf0cf8c39f"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]

View File

@ -7,12 +7,13 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.2"
Environment="ETCD_IMAGE_TAG=v3.3.3"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
Environment="ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379"
Environment="ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380"
Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
Environment="ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}"
Environment="ETCD_STRICT_RECONFIG_CHECK=true"
Environment="ETCD_SSL_DIR=/etc/ssl/etcd"
@ -81,6 +82,7 @@ systemd:
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/master \
--node-labels=node-role.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
@ -115,8 +117,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.4
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -137,7 +139,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.12.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \

View File

@ -28,7 +28,7 @@ resource "aws_instance" "controllers" {
# storage
root_block_device {
volume_type = "standard"
volume_type = "${var.disk_type}"
volume_size = "${var.disk_size}"
}
@ -56,10 +56,10 @@ data "template_file" "controller_config" {
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
etcd_initial_cluster = "${join(",", formatlist("%s=https://%s:2380", null_resource.repeat.*.triggers.name, null_resource.repeat.*.triggers.domain))}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
kubeconfig = "${indent(10, module.bootkube.kubeconfig)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}
}
@ -78,4 +78,5 @@ data "ct_config" "controller_ign" {
count = "${var.controller_count}"
content = "${element(data.template_file.controller_config.*.rendered, count.index)}"
pretty_print = false
snippets = ["${var.controller_clc_snippets}"]
}

View File

@ -51,6 +51,16 @@ resource "aws_security_group_rule" "controller-etcd" {
self = true
}
resource "aws_security_group_rule" "controller-etcd-metrics" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 2381
to_port = 2381
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-flannel" {
security_group_id = "${aws_security_group.controller.id}"

View File

@ -1,5 +1,5 @@
# Secure copy etcd TLS assets and kubeconfig to controllers. Activates kubelet.service
resource "null_resource" "copy-secrets" {
# Secure copy etcd TLS assets to controllers.
resource "null_resource" "copy-controller-secrets" {
count = "${var.controller_count}"
connection {
@ -9,11 +9,6 @@ resource "null_resource" "copy-secrets" {
timeout = "15m"
}
provisioner "file" {
content = "${module.bootkube.kubeconfig}"
destination = "$HOME/kubeconfig"
}
provisioner "file" {
content = "${module.bootkube.etcd_ca_cert}"
destination = "$HOME/etcd-client-ca.crt"
@ -61,7 +56,6 @@ resource "null_resource" "copy-secrets" {
"sudo mv etcd-peer.key /etc/ssl/etcd/etcd/peer.key",
"sudo chown -R etcd:etcd /etc/ssl/etcd",
"sudo chmod -R 500 /etc/ssl/etcd",
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}
@ -69,7 +63,12 @@ resource "null_resource" "copy-secrets" {
# Secure copy bootkube assets to ONE controller and start bootkube to perform
# one-time self-hosted cluster bootstrapping.
resource "null_resource" "bootkube-start" {
depends_on = ["module.bootkube", "null_resource.copy-secrets", "aws_route53_record.apiserver"]
depends_on = [
"module.bootkube",
"module.workers",
"aws_route53_record.apiserver",
"null_resource.copy-controller-secrets",
]
connection {
type = "ssh"
@ -85,7 +84,7 @@ resource "null_resource" "bootkube-start" {
provisioner "remote-exec" {
inline = [
"sudo mv /home/core/assets /opt/bootkube",
"sudo mv $HOME/assets /opt/bootkube",
"sudo systemctl start bootkube",
]
}

View File

@ -1,21 +1,44 @@
variable "cluster_name" {
type = "string"
description = "Cluster name"
description = "Unique cluster name (prepended to dns_zone)"
}
# AWS
variable "dns_zone" {
type = "string"
description = "AWS DNS Zone (e.g. aws.dghubble.io)"
description = "AWS Route53 DNS Zone (e.g. aws.example.com)"
}
variable "dns_zone_id" {
type = "string"
description = "AWS DNS Zone ID (e.g. Z3PAABBCFAKEC0)"
description = "AWS Route53 DNS Zone ID (e.g. Z3PAABBCFAKEC0)"
}
variable "ssh_authorized_key" {
# instances
variable "controller_count" {
type = "string"
description = "SSH public key for user 'core'"
default = "1"
description = "Number of controllers (i.e. masters)"
}
variable "worker_count" {
type = "string"
default = "1"
description = "Number of workers"
}
variable "controller_type" {
type = "string"
default = "t2.small"
description = "EC2 instance type for controllers"
}
variable "worker_type" {
type = "string"
default = "t2.small"
description = "EC2 instance type for workers"
}
variable "os_channel" {
@ -27,41 +50,34 @@ variable "os_channel" {
variable "disk_size" {
type = "string"
default = "40"
description = "The size of the disk in Gigabytes"
description = "Size of the EBS volume in GB"
}
variable "host_cidr" {
description = "CIDR IPv4 range to assign to EC2 nodes"
variable "disk_type" {
type = "string"
default = "10.0.0.0/16"
default = "gp2"
description = "Type of the EBS volume (e.g. standard, gp2, io1)"
}
variable "controller_count" {
variable "controller_clc_snippets" {
type = "list"
description = "Controller Container Linux Config snippets"
default = []
}
variable "worker_clc_snippets" {
type = "list"
description = "Worker Container Linux Config snippets"
default = []
}
# configuration
variable "ssh_authorized_key" {
type = "string"
default = "1"
description = "Number of controllers"
description = "SSH public key for user 'core'"
}
variable "controller_type" {
type = "string"
default = "t2.small"
description = "Controller EC2 instance type"
}
variable "worker_count" {
type = "string"
default = "1"
description = "Number of workers"
}
variable "worker_type" {
type = "string"
default = "t2.small"
description = "Worker EC2 instance type"
}
# bootkube assets
variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
type = "string"
@ -79,6 +95,12 @@ variable "network_mtu" {
default = "1480"
}
variable "host_cidr" {
description = "CIDR IPv4 range to assign to EC2 nodes"
type = "string"
default = "10.0.0.0/16"
}
variable "pod_cidr" {
description = "CIDR IPv4 range to assign Kubernetes pods"
type = "string"

View File

@ -16,4 +16,5 @@ module "workers" {
ssh_authorized_key = "${var.ssh_authorized_key}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
clc_snippets = "${var.worker_clc_snippets}"
}

View File

@ -39,8 +39,6 @@ systemd:
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
@ -89,8 +87,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.4
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -108,7 +106,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://gcr.io/google_containers/hyperkube:v1.9.4 \
docker://k8s.gcr.io/hyperkube:v1.10.1 \
--net=host \
--dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)

View File

@ -1,21 +1,23 @@
variable "name" {
type = "string"
description = "Unique name instance group"
description = "Unique name for the worker pool"
}
# AWS
variable "vpc_id" {
type = "string"
description = "ID of the VPC for creating instances"
description = "Must be set to `vpc_id` output by cluster"
}
variable "subnet_ids" {
type = "list"
description = "List of subnet IDs for creating instances"
description = "Must be set to `subnet_ids` output by cluster"
}
variable "security_groups" {
type = "list"
description = "List of security group IDs"
description = "Must be set to `worker_security_groups` output by cluster"
}
# instances
@ -41,14 +43,26 @@ variable "os_channel" {
variable "disk_size" {
type = "string"
default = "40"
description = "Size of the disk in GB"
description = "Size of the EBS volume in GB"
}
variable "disk_type" {
type = "string"
default = "gp2"
description = "Type of the EBS volume (e.g. standard, gp2, io1)"
}
variable "clc_snippets" {
type = "list"
description = "Container Linux Config snippets"
default = []
}
# configuration
variable "kubeconfig" {
type = "string"
description = "Generated Kubelet kubeconfig"
description = "Must be set to `kubeconfig` output by cluster"
}
variable "ssh_authorized_key" {

View File

@ -42,7 +42,7 @@ resource "aws_launch_configuration" "worker" {
# storage
root_block_device {
volume_type = "standard"
volume_type = "${var.disk_type}"
volume_size = "${var.disk_size}"
}
@ -71,4 +71,5 @@ data "template_file" "worker_config" {
data "ct_config" "worker_ign" {
content = "${data.template_file.worker_config.rendered}"
pretty_print = false
snippets = ["${var.clc_snippets}"]
}

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.9.4 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=c5fc93d95fe4993511656cdd6372afbd1307f08f"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=db36b92abced3c4b0af279adfd5ed4bf0cf8c39f"
cluster_name = "${var.cluster_name}"
api_servers = ["${var.k8s_domain_name}"]

View File

@ -12,6 +12,16 @@ systemd:
ExecStart=/opt/installer
[Install]
WantedBy=multi-user.target
# Avoid using the standard SSH port so terraform apply cannot SSH until
# post-install. But admins may SSH to debug disk install problems.
# After install, sshd will use port 22 and users/terraform can connect.
- name: sshd.socket
dropins:
- name: 10-sshd-port.conf
contents: |
[Socket]
ListenStream=
ListenStream=2222
storage:
files:
- path: /opt/installer
@ -32,11 +42,6 @@ storage:
systemctl reboot
passwd:
users:
# Avoid using standard name "core" so terraform apply cannot SSH until post-install.
- name: debug
create:
groups:
- sudo
- docker
- name: core
ssh_authorized_keys:
- {{.ssh_authorized_key}}
- "${ssh_authorized_key}"

View File

@ -7,12 +7,13 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.2"
Environment="ETCD_IMAGE_TAG=v3.3.3"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${domain_name}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${domain_name}:2380"
Environment="ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379"
Environment="ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380"
Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
Environment="ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}"
Environment="ETCD_STRICT_RECONFIG_CHECK=true"
Environment="ETCD_SSL_DIR=/etc/ssl/etcd"
@ -90,6 +91,7 @@ systemd:
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/master \
--node-labels=node-role.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
@ -116,8 +118,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.4
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/hostname
filesystem: root
mode: 0644
@ -144,7 +146,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.12.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \

View File

@ -47,8 +47,6 @@ systemd:
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
@ -81,8 +79,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.4
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/hostname
filesystem: root
mode: 0644

View File

@ -8,10 +8,6 @@ resource "matchbox_group" "container-linux-install" {
selector {
mac = "${element(concat(var.controller_macs, var.worker_macs), count.index)}"
}
metadata {
ssh_authorized_key = "${var.ssh_authorized_key}"
}
}
resource "matchbox_group" "controller" {

View File

@ -32,6 +32,7 @@ data "template_file" "container-linux-install-configs" {
ignition_endpoint = "${format("%s/ignition", var.matchbox_http_endpoint)}"
install_disk = "${var.install_disk}"
container_linux_oem = "${var.container_linux_oem}"
ssh_authorized_key = "${var.ssh_authorized_key}"
# only cached-container-linux profile adds -b baseurl
baseurl_flag = ""
@ -73,6 +74,7 @@ data "template_file" "cached-container-linux-install-configs" {
ignition_endpoint = "${format("%s/ignition", var.matchbox_http_endpoint)}"
install_disk = "${var.install_disk}"
container_linux_oem = "${var.container_linux_oem}"
ssh_authorized_key = "${var.ssh_authorized_key}"
# profile uses -b baseurl to install from matchbox cache
baseurl_flag = "-b ${var.matchbox_http_endpoint}/assets/coreos"

View File

@ -1,5 +1,5 @@
# Secure copy etcd TLS assets and kubeconfig to controllers. Activates kubelet.service
resource "null_resource" "copy-etcd-secrets" {
resource "null_resource" "copy-controller-secrets" {
count = "${length(var.controller_names)}"
connection {
@ -61,13 +61,13 @@ resource "null_resource" "copy-etcd-secrets" {
"sudo mv etcd-peer.key /etc/ssl/etcd/etcd/peer.key",
"sudo chown -R etcd:etcd /etc/ssl/etcd",
"sudo chmod -R 500 /etc/ssl/etcd",
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}
# Secure copy kubeconfig to all workers. Activates kubelet.service
resource "null_resource" "copy-kubeconfig" {
resource "null_resource" "copy-worker-secrets" {
count = "${length(var.worker_names)}"
connection {
@ -84,7 +84,7 @@ resource "null_resource" "copy-kubeconfig" {
provisioner "remote-exec" {
inline = [
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}
@ -95,13 +95,16 @@ resource "null_resource" "bootkube-start" {
# Without depends_on, this remote-exec may start before the kubeconfig copy.
# Terraform only does one task at a time, so it would try to bootstrap
# while no Kubelets are running.
depends_on = ["null_resource.copy-etcd-secrets", "null_resource.copy-kubeconfig"]
depends_on = [
"null_resource.copy-controller-secrets",
"null_resource.copy-worker-secrets",
]
connection {
type = "ssh"
host = "${element(var.controller_domains, 0)}"
user = "core"
timeout = "30m"
timeout = "15m"
}
provisioner "file" {
@ -111,7 +114,7 @@ resource "null_resource" "bootkube-start" {
provisioner "remote-exec" {
inline = [
"sudo mv /home/core/assets /opt/bootkube",
"sudo mv $HOME/assets /opt/bootkube",
"sudo systemctl start bootkube",
]
}

View File

@ -1,3 +1,10 @@
variable "cluster_name" {
type = "string"
description = "Unique cluster name"
}
# bare-metal
variable "matchbox_http_endpoint" {
type = "string"
description = "Matchbox HTTP read-only endpoint (e.g. http://matchbox.example.com:8080)"
@ -13,17 +20,7 @@ variable "container_linux_version" {
description = "Container Linux version of the kernel/initrd to PXE or the image to install"
}
variable "cluster_name" {
type = "string"
description = "Cluster name"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key to set as an authorized_key on machines"
}
# Machines
# machines
# Terraform's crude "type system" does not properly support lists of maps so we do this.
variable "controller_names" {
@ -50,13 +47,18 @@ variable "worker_domains" {
type = "list"
}
# bootkube assets
# configuration
variable "k8s_domain_name" {
description = "Controller DNS name which resolves to a controller instance. Workers and kubeconfig's will communicate with this endpoint (e.g. cluster.example.com)"
type = "string"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for user 'core'"
}
variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
type = "string"
@ -75,14 +77,14 @@ variable "network_mtu" {
}
variable "pod_cidr" {
description = "CIDR IP range to assign Kubernetes pods"
description = "CIDR IPv4 range to assign Kubernetes pods"
type = "string"
default = "10.2.0.0/16"
}
variable "service_cidr" {
description = <<EOD
CIDR IP range to assign Kubernetes services.
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
EOD

View File

@ -1,117 +0,0 @@
---
systemd:
units:
- name: docker.service
enable: true
- name: locksmithd.service
mask: true
- name: kubelet.path
enable: true
contents: |
[Unit]
Description=Watch for kubeconfig
[Path]
PathExists=/etc/kubernetes/kubeconfig
[Install]
WantedBy=multi-user.target
- name: wait-for-dns.service
enable: true
contents: |
[Unit]
Description=Wait for DNS entries
Wants=systemd-resolved.service
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
- name: kubelet.service
contents: |
[Unit]
Description=Kubelet via Hyperkube
Wants=rpc-statd.service
[Service]
EnvironmentFile=/etc/kubernetes/kubelet.env
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/cache/kubelet-pod.uuid \
--volume=resolv,kind=host,source=/etc/resolv.conf \
--mount volume=resolv,target=/etc/resolv.conf \
--volume var-lib-cni,kind=host,source=/var/lib/cni \
--mount volume=var-lib-cni,target=/var/lib/cni \
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=opt-cni-bin,target=/opt/cni/bin \
--volume var-log,kind=host,source=/var/log \
--mount volume=var-log,target=/var/log \
--insecure-options=image"
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
--allow-privileged \
--anonymous-auth=false \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns={{.k8s_dns_service_ip}} \
--cluster_domain={{.cluster_domain_suffix}} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--hostname-override={{.domain_name}} \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/node \
--pod-manifest-path=/etc/kubernetes/manifests \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
storage:
{{ if index . "pxe" }}
disks:
- device: /dev/sda
wipe_table: true
partitions:
- label: ROOT
filesystems:
- name: root
mount:
device: "/dev/sda1"
format: "ext4"
create:
force: true
options:
- "-LROOT"
{{end}}
files:
- path: /etc/kubernetes/kubelet.env
filesystem: root
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.4
- path: /etc/hostname
filesystem: root
mode: 0644
contents:
inline:
{{.domain_name}}
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
inline: |
fs.inotify.max_user_watches=16184
passwd:
users:
- name: core
ssh_authorized_keys:
- {{.ssh_authorized_key}}

View File

@ -1,19 +0,0 @@
resource "matchbox_group" "workers" {
count = "${length(var.worker_names)}"
name = "${format("%s-%s", var.cluster_name, element(var.worker_names, count.index))}"
profile = "${matchbox_profile.bootkube-worker-pxe.name}"
selector {
mac = "${element(var.worker_macs, count.index)}"
}
metadata {
pxe = "true"
domain_name = "${element(var.worker_domains, count.index)}"
etcd_endpoints = "${join(",", formatlist("%s:2379", var.controller_domains))}"
k8s_dns_service_ip = "${var.kube_dns_service_ip}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
ssh_authorized_key = "${var.ssh_authorized_key}"
}
}

View File

@ -1,20 +0,0 @@
// Container Linux Install profile (from release.core-os.net)
resource "matchbox_profile" "bootkube-worker-pxe" {
name = "bootkube-worker-pxe"
kernel = "http://${var.container_linux_channel}.release.core-os.net/amd64-usr/${var.container_linux_version}/coreos_production_pxe.vmlinuz"
initrd = [
"http://${var.container_linux_channel}.release.core-os.net/amd64-usr/${var.container_linux_version}/coreos_production_pxe_image.cpio.gz",
]
args = [
"initrd=coreos_production_pxe_image.cpio.gz",
"coreos.config.url=${var.matchbox_http_endpoint}/ignition?uuid=$${uuid}&mac=$${mac:hexhyp}",
"coreos.first_boot=yes",
"console=tty0",
"console=ttyS0",
"${var.kernel_args}",
]
container_linux_config = "${file("${path.module}/cl/bootkube-worker.yaml.tmpl")}"
}

View File

@ -1,22 +0,0 @@
# Secure copy kubeconfig to all nodes to activate kubelet.service
resource "null_resource" "copy-kubeconfig" {
count = "${length(var.worker_names)}"
connection {
type = "ssh"
host = "${element(var.worker_domains, count.index)}"
user = "core"
timeout = "60m"
}
provisioner "file" {
content = "${var.kubeconfig}"
destination = "$HOME/kubeconfig"
}
provisioner "remote-exec" {
inline = [
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}

View File

@ -1,72 +0,0 @@
variable "cluster_name" {
description = "Cluster name"
type = "string"
}
variable "matchbox_http_endpoint" {
type = "string"
description = "Matchbox HTTP read-only endpoint (e.g. http://matchbox.example.com:8080)"
}
variable "container_linux_channel" {
type = "string"
description = "Container Linux channel corresponding to the container_linux_version"
}
variable "container_linux_version" {
type = "string"
description = "Container Linux version of the kernel/initrd to PXE or the image to install"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key to set as an authorized key"
}
# machines
# Terraform's crude "type system" does properly support lists of maps so we do this.
variable "controller_domains" {
type = "list"
}
variable "worker_names" {
type = "list"
}
variable "worker_macs" {
type = "list"
}
variable "worker_domains" {
type = "list"
}
# bootkube
variable "kubeconfig" {
type = "string"
}
variable "kube_dns_service_ip" {
description = "Kubernetes service IP for kube-dns (must be within server_cidr)"
type = "string"
default = "10.3.0.10"
}
# optional
variable "kernel_args" {
description = "Additional kernel arguments to provide at PXE boot."
type = "list"
default = [
"root=/dev/sda1",
]
}
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by kube-dns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = "string"
default = "cluster.local"
}

View File

@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.9.4 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* Kubernetes v1.10.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)

View File

@ -1,12 +1,12 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=c5fc93d95fe4993511656cdd6372afbd1307f08f"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=db36b92abced3c4b0af279adfd5ed4bf0cf8c39f"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
etcd_servers = "${digitalocean_record.etcds.*.fqdn}"
asset_dir = "${var.asset_dir}"
networking = "${var.networking}"
networking = "flannel"
network_mtu = 1440
pod_cidr = "${var.pod_cidr}"
service_cidr = "${var.service_cidr}"

View File

@ -7,12 +7,13 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.2"
Environment="ETCD_IMAGE_TAG=v3.3.3"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
Environment="ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379"
Environment="ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380"
Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
Environment="ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}"
Environment="ETCD_STRICT_RECONFIG_CHECK=true"
Environment="ETCD_SSL_DIR=/etc/ssl/etcd"
@ -93,6 +94,7 @@ systemd:
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/master \
--node-labels=node-role.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
@ -121,8 +123,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.4
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -143,7 +145,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.12.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \

View File

@ -50,8 +50,6 @@ systemd:
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
@ -95,8 +93,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.4
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -114,7 +112,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://gcr.io/google_containers/hyperkube:v1.9.4 \
docker://k8s.gcr.io/hyperkube:v1.10.1 \
--net=host \
--dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)

View File

@ -90,4 +90,6 @@ data "ct_config" "controller_ign" {
count = "${var.controller_count}"
content = "${element(data.template_file.controller_config.*.rendered, count.index)}"
pretty_print = false
snippets = ["${var.controller_clc_snippets}"]
}

View File

@ -1,10 +1,10 @@
# Secure copy kubeconfig to all nodes. Activates kubelet.service
resource "null_resource" "copy-secrets" {
count = "${var.controller_count + var.worker_count}"
# Secure copy etcd TLS assets and kubeconfig to controllers. Activates kubelet.service
resource "null_resource" "copy-controller-secrets" {
count = "${var.controller_count}"
connection {
type = "ssh"
host = "${element(concat(digitalocean_droplet.controllers.*.ipv4_address, digitalocean_droplet.workers.*.ipv4_address), count.index)}"
host = "${element(concat(digitalocean_droplet.controllers.*.ipv4_address), count.index)}"
user = "core"
timeout = "15m"
}
@ -61,7 +61,30 @@ resource "null_resource" "copy-secrets" {
"sudo mv etcd-peer.key /etc/ssl/etcd/etcd/peer.key",
"sudo chown -R etcd:etcd /etc/ssl/etcd",
"sudo chmod -R 500 /etc/ssl/etcd",
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}
# Secure copy kubeconfig to all workers. Activates kubelet.service.
resource "null_resource" "copy-worker-secrets" {
count = "${var.worker_count}"
connection {
type = "ssh"
host = "${element(concat(digitalocean_droplet.workers.*.ipv4_address), count.index)}"
user = "core"
timeout = "15m"
}
provisioner "file" {
content = "${module.bootkube.kubeconfig}"
destination = "$HOME/kubeconfig"
}
provisioner "remote-exec" {
inline = [
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}
@ -69,7 +92,11 @@ resource "null_resource" "copy-secrets" {
# Secure copy bootkube assets to ONE controller and start bootkube to perform
# one-time self-hosted cluster bootstrapping.
resource "null_resource" "bootkube-start" {
depends_on = ["module.bootkube", "null_resource.copy-secrets"]
depends_on = [
"module.bootkube",
"null_resource.copy-controller-secrets",
"null_resource.copy-worker-secrets",
]
connection {
type = "ssh"
@ -85,7 +112,7 @@ resource "null_resource" "bootkube-start" {
provisioner "remote-exec" {
inline = [
"sudo mv /home/core/assets /opt/bootkube",
"sudo mv $HOME/assets /opt/bootkube",
"sudo systemctl start bootkube",
]
}

View File

@ -1,8 +1,10 @@
variable "cluster_name" {
type = "string"
description = "Unique cluster name"
description = "Unique cluster name (prepended to dns_zone)"
}
# Digital Ocean
variable "region" {
type = "string"
description = "Digital Ocean region (e.g. nyc1, sfo2, fra1, tor1)"
@ -13,22 +15,12 @@ variable "dns_zone" {
description = "Digital Ocean domain (i.e. DNS zone) (e.g. do.example.com)"
}
variable "image" {
type = "string"
default = "coreos-stable"
description = "OS image from which to initialize the disk (e.g. coreos-stable)"
}
# instances
variable "controller_count" {
type = "string"
default = "1"
description = "Number of controllers"
}
variable "controller_type" {
type = "string"
default = "s-2vcpu-2gb"
description = "Digital Ocean droplet size (e.g. s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb)."
description = "Number of controllers (i.e. masters)"
}
variable "worker_count" {
@ -37,39 +29,57 @@ variable "worker_count" {
description = "Number of workers"
}
variable "controller_type" {
type = "string"
default = "s-2vcpu-2gb"
description = "Droplet type for controllers (e.g. s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb)."
}
variable "worker_type" {
type = "string"
default = "s-1vcpu-1gb"
description = "Digital Ocean droplet size (e.g. s-1vcpu-1gb, s-1vcpu-2gb, s-2vcpu-2gb)"
description = "Droplet type for workers (e.g. s-1vcpu-1gb, s-1vcpu-2gb, s-2vcpu-2gb)"
}
variable "image" {
type = "string"
default = "coreos-stable"
description = "Container Linux image for instances (e.g. coreos-stable)"
}
variable "controller_clc_snippets" {
type = "list"
description = "Controller Container Linux Config snippets"
default = []
}
variable "worker_clc_snippets" {
type = "list"
description = "Worker Container Linux Config snippets"
default = []
}
# configuration
variable "ssh_fingerprints" {
type = "list"
description = "SSH public key fingerprints. (e.g. see `ssh-add -l -E md5`)"
}
# bootkube assets
variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
type = "string"
}
variable "networking" {
description = "Choice of networking provider (flannel or calico)"
type = "string"
default = "flannel"
}
variable "pod_cidr" {
description = "CIDR IP range to assign Kubernetes pods"
description = "CIDR IPv4 range to assign Kubernetes pods"
type = "string"
default = "10.2.0.0/16"
}
variable "service_cidr" {
description = <<EOD
CIDR IP range to assign Kubernetes services.
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
EOD

View File

@ -51,4 +51,5 @@ data "template_file" "worker_config" {
data "ct_config" "worker_ign" {
content = "${data.template_file.worker_config.rendered}"
pretty_print = false
snippets = ["${var.worker_clc_snippets}"]
}

View File

@ -1,6 +1,130 @@
# Customization
To customize clusters in ways that aren't supported by input variables, fork the repo and make changes to the Terraform module. Stay tuned for improvements to this strategy since it is beneficial to stay close to this upstream.
Typhoon provides minimal Kubernetes clusters with defaults we recommend for production. Terraform variables provide easy to use and supported customizations for clusters. Advanced options are available for customizing the architecture or hosts.
## Variables
Typhoon modules accept Terraform input variables for customizing clusters in meritorious ways (e.g. `worker_count`, etc). Variables are carefully considered to provide essentials, while limiting complexity and test matrix burden. See each platform's tutorial for options.
## Addons
Clusters are kept to a minimal Kubernetes control plane by offering components like Nginx Ingress Controller, Prometheus, Grafana, and Heapster as optional post-install [addons](https://github.com/poseidon/typhoon/tree/master/addons). Customize addons by modifying a copy of our addon manifests.
## Hosts
### Container Linux
!!! danger
Container Linux Configs provide powerful host customization abilities. You are responsible for the additional configs defined for hosts.
Container Linux Configs (CLCs) declare how a Container Linux instance's disk should be provisioned on first boot from disk. CLCs define disk partitions, filesystems, files, systemd units, dropins, networkd configs, mount units, raid arrays, and users. Typhoon creates controller and worker instances with base Container Linux Configs to create a minimal, secure Kubernetes cluster on each platform.
Typhoon AWS, Google Cloud, and Digital Ocean give users the ability to provide CLC *snippets* - valid Container Linux Configs that are validated and additively merged into the Typhoon base config during `terraform plan`. This allows advanced host customizations and experimentation.
#### Examples
Container Linux [docs](https://coreos.com/os/docs/latest/clc-examples.html) show many simple config examples. Ensure a file `/opt/hello` is created with permissions 0644.
```
# custom-files
storage:
files:
- path: /opt/hello
filesystem: root
contents:
inline: |
Hello World
mode: 0644
```
Ensure a systemd unit `hello.service` is created and a dropin `50-etcd-cluster.conf` is added for `etcd-member.service`.
```
# custom-units
systemd:
units:
- name: hello.service
enable: true
contents: |
[Unit]
Description=Hello World
[Service]
Type=oneshot
ExecStart=/usr/bin/echo Hello World!
[Install]
WantedBy=multi-user.target
- name: etcd-member.service
enable: true
dropins:
- name: 50-etcd-cluster.conf
contents: |
Environment="ETCD_LOG_PACKAGE_LEVELS=etcdserver=WARNING,security=DEBUG"
```
#### Specification
View the Container Linux Config [format](https://coreos.com/os/docs/1576.4.0/configuration.html) to read about each field.
#### Usage
Write Container Linux Configs *snippets* as files in the repository where you keep Terraform configs for clusters (perhaps in a `clc` or `snippets` subdirectory). You may organize snippets in multiple files as desired, provided they are each valid.
Define an [AWS](https://typhoon.psdn.io/aws/#cluster), [Google Cloud](https://typhoon.psdn.io/google-cloud/#cluster), or [Digital Ocean](https://typhoon.psdn.io/digital-ocean/#cluster) cluster and fill in the optional `controller_clc_snippets` or `worker_clc_snippets` fields.
```
module "digital-ocean-nemo" {
...
controller_count = 1
worker_count = 2
controller_clc_snippets = [
"${file("./custom-files")}",
"${file("./custom-units")}",
]
worker_clc_snippets = [
"${file("./custom-files")}",
"${file("./custom-units")}",
]
...
}
```
Plan the resources to be created.
```
$ terraform plan
Plan: 54 to add, 0 to change, 0 to destroy.
```
Most syntax errors in CLCs can be caught during planning. For example, mangle the indentation in one of the CLC files:
```
$ terraform plan
...
error parsing Container Linux Config: error: yaml: line 3: did not find expected '-' indicator
```
Undo the mangle. Apply the changes to create the cluster per the tutorial.
```
$ terraform apply
```
Container Linux Configs (and the CoreOS Ignition system) create immutable infrastructure. Disk provisioning is performed only on first boot from disk. That means if you change a snippet used by an instance, Terraform will (correctly) try to destroy and recreate that instance. Be careful!
!!! danger
Destroying and recreating controller instances is destructive! etcd runs on controller instances and stores data there. Do not modify controller snippets. See [blue/green](https://typhoon.psdn.io/topics/maintenance/#upgrades) clusters.
## Architecture
To customize clusters in ways that aren't supported by input variables, fork Typhoon and maintain a repository with customizations. Reference the repository by changing the username.
```
module "digital-ocean-nemo" {
source = "git::https://github.com/USERNAME/typhoon//digital-ocean/container-linux/kubernetes?ref=myspecialcase"
...
}
```
To customize lower-level Kubernetes control plane bootstrapping, see the [poseidon/bootkube-terraform](https://github.com/poseidon/bootkube-terraform) Terraform module.

View File

@ -13,7 +13,7 @@ Create a cluster following the AWS [tutorial](../aws.md#cluster). Define a worke
```tf
module "tempest-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes/workers?ref=v1.9.4"
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes/workers?ref=v1.10.1"
providers = {
aws = "aws.default"
@ -56,7 +56,7 @@ The AWS internal `workers` module supports a number of [variables](https://githu
| security_groups | Must be set to `worker_security_groups` output by cluster | "${module.cluster.worker_security_groups}" |
| name | Unique name (distinct from cluster name) | "tempest-m5s" |
| kubeconfig | Must be set to `kubeconfig` output by cluster | "${module.cluster.kubeconfig}" |
| ssh_authorized_key | SSH public key for ~/.ssh_authorized_keys | "ssh-rsa AAAAB3NZ..." |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
#### Optional
@ -77,7 +77,7 @@ Create a cluster following the Google Cloud [tutorial](../google-cloud.md#cluste
```tf
module "yavin-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.9.4"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.10.1"
providers = {
google = "google.default"
@ -111,11 +111,11 @@ Verify a managed instance group of workers joins the cluster within a few minute
```
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.9.4
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.4
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.4
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.9.4
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.9.4
yavin-controller-0.c.example-com.internal Ready 6m v1.10.1
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.1
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.1
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.10.1
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.10.1
```
### Variables
@ -131,7 +131,7 @@ The Google Cloud internal `workers` module supports a number of [variables](http
| name | Unique name (distinct from cluster name) | "yavin-16x" |
| cluster_name | Must be set to `cluster_name` of cluster | "yavin" |
| kubeconfig | Must be set to `kubeconfig` output by cluster | "${module.cluster.kubeconfig}" |
| ssh_authorized_key | SSH public key for ~/.ssh_authorized_keys | "ssh-rsa AAAAB3NZ..." |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
#### Optional
@ -139,7 +139,7 @@ The Google Cloud internal `workers` module supports a number of [variables](http
|:-----|:------------|:--------|:--------|
| count | Number of instances | 1 | 3 |
| machine_type | Compute instance machine type | "n1-standard-1" | See below |
| os_image | OS image for compute instances | "coreos-stable" | "coreos-alpha", "coreos-beta" |
| os_image | Container Linux image for compute instances | "coreos-stable" | "coreos-alpha", "coreos-beta" |
| disk_size | Size of the disk in GB | 40 | 100 |
| preemptible | If true, Compute Engine will terminate instances randomly within 24 hours | false | true |
| service_cidr | Must match `service_cidr` of cluster | "10.3.0.0/16" | "10.3.0.0/24" |

View File

@ -1,6 +1,6 @@
# AWS
In this tutorial, we'll create a Kubernetes v1.9.4 cluster on AWS.
In this tutorial, we'll create a Kubernetes v1.10.1 cluster on AWS.
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, a VPC, gateway, subnets, auto-scaling groups of controllers and workers, network load balancers for controllers and workers, and security groups will be created.
@ -57,7 +57,7 @@ Configure the AWS provider to use your access key credentials in a `providers.tf
```tf
provider "aws" {
version = "~> 1.5.0"
version = "~> 1.11.0"
alias = "default"
region = "eu-central-1"
@ -96,7 +96,7 @@ Define a Kubernetes cluster using the module `aws/container-linux/kubernetes`.
```tf
module "aws-tempest" {
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.9.4"
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.10.1"
providers = {
aws = "aws.default"
@ -105,20 +105,19 @@ module "aws-tempest" {
template = "template.default"
tls = "tls.default"
}
cluster_name = "tempest"
# AWS
dns_zone = "aws.example.com"
dns_zone_id = "Z3PAABBCFAKEC0"
controller_count = 1
controller_type = "t2.medium"
worker_count = 2
worker_type = "t2.small"
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
cluster_name = "tempest"
dns_zone = "aws.example.com"
dns_zone_id = "Z3PAABBCFAKEC0"
# bootkube
asset_dir = "/home/user/.secrets/clusters/tempest"
# configuration
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
asset_dir = "/home/user/.secrets/clusters/tempest"
# optional
worker_count = 2
worker_type = "t2.medium"
}
```
@ -150,7 +149,7 @@ Get or update Terraform modules.
$ terraform get # downloads missing modules
$ terraform get --update # updates all modules
Get: git::https://github.com/poseidon/typhoon (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.12.0 (update)
```
Plan the resources to be created.
@ -182,9 +181,9 @@ In 4-8 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/tempest/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
ip-10-0-12-221 Ready 34m v1.9.4
ip-10-0-19-112 Ready 34m v1.9.4
ip-10-0-4-22 Ready 34m v1.9.4
ip-10-0-12-221 Ready 34m v1.10.1
ip-10-0-19-112 Ready 34m v1.10.1
ip-10-0-4-22 Ready 34m v1.10.1
```
List the pods.
@ -225,7 +224,6 @@ Learn about [maintenance](topics/maintenance.md) and [addons](addons/overview.md
| dns_zone | AWS Route53 DNS zone | "aws.example.com" |
| dns_zone_id | AWS Route53 DNS zone id | "Z3PAABBCFAKEC0" |
| ssh_authorized_key | SSH public key for ~/.ssh_authorized_keys | "ssh-rsa AAAAB3NZ..." |
| os_channel | Container Linux AMI channel | stable, beta, alpha |
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/tempest" |
#### DNS Zone
@ -250,15 +248,19 @@ Reference the DNS zone id with `"${aws_route53_zone.zone-for-clusters.zone_id}"`
| Name | Description | Default | Example |
|:-----|:------------|:--------|:--------|
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
| controller_type | Controller EC2 instance type | "t2.small" | "t2.medium" |
| worker_count | Number of workers | 1 | 3 |
| worker_type | Worker EC2 instance type | "t2.small" | "t2.medium" |
| controller_type | EC2 instance type for controllers | "t2.small" | See below |
| worker_type | EC2 instance type for workers | "t2.small" | See below |
| os_channel | Container Linux AMI channel | stable | stable, beta, alpha |
| disk_size | Size of the EBS volume in GB | "40" | "100" |
| disk_type | Type of the EBS volume | "gp2" | standard, gp2, io1 |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | |
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
| network_mtu | CNI interface MTU (calico only) | 1480 | 8981 |
| host_cidr | CIDR range to assign to EC2 instances | "10.0.0.0/16" | "10.1.0.0/16" |
| pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| host_cidr | CIDR IPv4 range to assign to EC2 instances | "10.0.0.0/16" | "10.1.0.0/16" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
Check the list of valid [instance types](https://aws.amazon.com/ec2/instance-types/).

View File

@ -1,6 +1,6 @@
# Bare-Metal
In this tutorial, we'll network boot and provision a Kubernetes v1.9.4 cluster on bare-metal.
In this tutorial, we'll network boot and provision a Kubernetes v1.10.1 cluster on bare-metal.
First, we'll deploy a [Matchbox](https://github.com/coreos/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Container Linux to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers.
@ -22,10 +22,7 @@ Collect a MAC address from each machine. For machines with multiple PXE-enabled
* 52:54:00:b2:2f:86 (node2)
* 52:54:00:c3:61:77 (node3)
Configure each machine to boot from the disk [^1] through IPMI or the BIOS menu.
[^1]: Configuring "diskless" workers that always PXE boot is possible, but not in the scope of this tutorial.
Configure each machine to boot from the disk through IPMI or the BIOS menu.
```
ipmitool -H node1 -U USER -P PASS chassis bootdev disk options=persistent
@ -177,7 +174,7 @@ Define a Kubernetes cluster using the module `bare-metal/container-linux/kuberne
```tf
module "bare-metal-mercury" {
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.9.4"
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.10.1"
providers = {
local = "local.default"
@ -186,15 +183,16 @@ module "bare-metal-mercury" {
tls = "tls.default"
}
# install
# bare-metal
cluster_name = "mercury"
matchbox_http_endpoint = "http://matchbox.example.com"
container_linux_channel = "stable"
container_linux_version = "1632.3.0"
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
# cluster
cluster_name = "mercury"
k8s_domain_name = "node1.example.com"
# configuration
k8s_domain_name = "node1.example.com"
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
asset_dir = "/home/user/.secrets/clusters/mercury"
# machines
controller_names = ["node1"]
@ -212,9 +210,6 @@ module "bare-metal-mercury" {
"node2.example.com",
"node3.example.com",
]
# output assets dir
asset_dir = "/home/user/.secrets/clusters/mercury"
}
```
@ -246,7 +241,7 @@ Get or update Terraform modules.
$ terraform get # downloads missing modules
$ terraform get --update # updates all modules
Get: git::https://github.com/poseidon/typhoon (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.12.0 (update)
```
Plan the resources to be created.
@ -297,10 +292,19 @@ module.bare-metal-mercury.null_resource.bootkube-start: Creation complete (ID: 5
Apply complete! Resources: 55 added, 0 changed, 0 destroyed.
```
To watch the install to disk (until machines reboot from disk), SSH to port 2222.
```
# before v1.10.1
$ ssh debug@node1.example.com
# after v1.10.1
$ ssh -p 2222 core@node1.example.com
```
To watch the bootstrap process in detail, SSH to the first controller and journal the logs.
```
$ ssh node1.example.com
$ ssh core@node1.example.com
$ journalctl -f -u bootkube
bootkube[5]: Pod Status: pod-checkpointer Running
bootkube[5]: Pod Status: kube-apiserver Running
@ -318,9 +322,9 @@ bootkube[5]: Tearing down temporary bootstrap control plane...
$ export KUBECONFIG=/home/user/.secrets/clusters/mercury/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
node1.example.com Ready 11m v1.9.4
node2.example.com Ready 11m v1.9.4
node3.example.com Ready 11m v1.9.4
node1.example.com Ready 11m v1.10.1
node2.example.com Ready 11m v1.10.1
node3.example.com Ready 11m v1.10.1
```
List the pods.
@ -357,19 +361,19 @@ Learn about [maintenance](topics/maintenance.md) and [addons](addons/overview.md
| Name | Description | Example |
|:-----|:------------|:--------|
| cluster_name | Unique cluster name | mercury |
| matchbox_http_endpoint | Matchbox HTTP read-only endpoint | http://matchbox.example.com:8080 |
| container_linux_channel | Container Linux channel | stable, beta, alpha |
| container_linux_version | Container Linux version of the kernel/initrd to PXE and the image to install | 1632.3.0 |
| cluster_name | Cluster name | mercury |
| k8s_domain_name | FQDN resolving to the controller(s) nodes. Workers and kubectl will communicate with this endpoint | "myk8s.example.com" |
| ssh_authorized_key | SSH public key for ~/.ssh/authorized_keys | "ssh-rsa AAAAB3Nz..." |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3Nz..." |
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/mercury" |
| controller_names | Ordered list of controller short names | ["node1"] |
| controller_macs | Ordered list of controller identifying MAC addresses | ["52:54:00:a1:9c:ae"] |
| controller_domains | Ordered list of controller FQDNs | ["node1.example.com"] |
| worker_names | Ordered list of worker short names | ["node2", "node3"] |
| worker_macs | Ordered list of worker identifying MAC addresses | ["52:54:00:b2:2f:86", "52:54:00:c3:61:77"] |
| worker_domains | Ordered list of worker FQDNs | ["node2.example.com", "node3.example.com"] |
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/mercury" |
### Optional
@ -380,8 +384,8 @@ Learn about [maintenance](topics/maintenance.md) and [addons](addons/overview.md
| container_linux_oem | Specify alternative OEM image ids for the disk install | "" | "vmware_raw", "xen" |
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
| network_mtu | CNI interface MTU (calico-only) | 1480 | - |
| pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
| kernel_args | Additional kernel args to provide at PXE boot | [] | "kvm-intel.nested=1" |

View File

@ -1,6 +1,6 @@
# Digital Ocean
In this tutorial, we'll create a Kubernetes v1.9.4 cluster on Digital Ocean.
In this tutorial, we'll create a Kubernetes v1.10.1 cluster on Digital Ocean.
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, firewall rules, DNS records, tags, and droplets for Kubernetes controllers and workers will be created.
@ -90,7 +90,7 @@ Define a Kubernetes cluster using the module `digital-ocean/container-linux/kube
```tf
module "digital-ocean-nemo" {
source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes?ref=v1.9.4"
source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes?ref=v1.10.1"
providers = {
digitalocean = "digitalocean.default"
@ -100,19 +100,18 @@ module "digital-ocean-nemo" {
tls = "tls.default"
}
region = "nyc3"
dns_zone = "digital-ocean.example.com"
# Digital Ocean
cluster_name = "nemo"
region = "nyc3"
dns_zone = "digital-ocean.example.com"
cluster_name = "nemo"
image = "coreos-stable"
controller_count = 1
controller_type = "s-2vcpu-2gb"
worker_count = 2
worker_type = "s-1vcpu-1gb"
# configuration
ssh_fingerprints = ["d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7"]
# output assets dir
asset_dir = "/home/user/.secrets/clusters/nemo"
asset_dir = "/home/user/.secrets/clusters/nemo"
# optional
worker_count = 2
worker_type = "s-1vcpu-1gb"
}
```
@ -144,7 +143,7 @@ Get or update Terraform modules.
$ terraform get # downloads missing modules
$ terraform get --update # updates all modules
Get: git::https://github.com/poseidon/typhoon (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.12.0 (update)
```
Plan the resources to be created.
@ -177,9 +176,9 @@ In 3-6 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/nemo/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
10.132.110.130 Ready 10m v1.9.4
10.132.115.81 Ready 10m v1.9.4
10.132.124.107 Ready 10m v1.9.4
10.132.110.130 Ready 10m v1.10.1
10.132.115.81 Ready 10m v1.10.1
10.132.124.107 Ready 10m v1.10.1
```
List the pods.
@ -260,20 +259,18 @@ Digital Ocean requires the SSH public key be uploaded to your account, so you ma
| Name | Description | Default | Example |
|:-----|:------------|:--------|:--------|
| image | OS image for droplets | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha |
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
| controller_type | Digital Ocean droplet size | s-2vcpu-2gb | s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb, ... |
| worker_count | Number of workers | 1 | 3 |
| worker_type | Digital Ocean droplet size | s-1vcpu-1gb | s-1vcpu-1gb, s-1vcpu-2gb, s-2vcpu-2gb, ... |
| networking | Choice of networking provider | "flannel" | "flannel" |
| pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| controller_type | Droplet type for controllers | s-2vcpu-2gb | s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb, ... |
| worker_type | Droplet type for workers | s-1vcpu-1gb | s-1vcpu-1gb, s-1vcpu-2gb, s-2vcpu-2gb, ... |
| image | Container Linux image for instances | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
Check the list of valid [droplet types](https://developers.digitalocean.com/documentation/changelog/api-v2/new-size-slugs-for-droplet-plan-changes/) or use `doctl compute size list`.
!!! warning
Do not choose a `controller_type` smaller than 2GB. Smaller droplets are not sufficient for running a controller and bootstrapping will fail.
!!! bug
Digital Ocean firewalls do not yet support the IP tunneling (IP in IP) protocol used by Calico. You can try using "calico" for `networking`, but it will only work if the cloud firewall is removed (unsafe).

View File

@ -1,6 +1,6 @@
# Google Cloud
In this tutorial, we'll create a Kubernetes v1.9.4 cluster on Google Compute Engine (not GKE).
In this tutorial, we'll create a Kubernetes v1.10.1 cluster on Google Compute Engine (not GKE).
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, a network, firewall rules, managed instance groups of Kubernetes controllers and workers, network load balancers for controllers and workers, and health checks will be created.
@ -57,7 +57,7 @@ Configure the Google Cloud provider to use your service account key, project-id,
```tf
provider "google" {
version = "1.2"
version = "1.6"
alias = "default"
credentials = "${file("~/.config/google-cloud/terraform.json")}"
@ -97,7 +97,7 @@ Define a Kubernetes cluster using the module `google-cloud/container-linux/kuber
```tf
module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.9.4"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.10.1"
providers = {
google = "google.default"
@ -108,18 +108,17 @@ module "google-cloud-yavin" {
}
# Google Cloud
cluster_name = "yavin"
region = "us-central1"
dns_zone = "example.com"
dns_zone_name = "example-zone"
os_image = "coreos-stable"
cluster_name = "yavin"
controller_count = 1
worker_count = 2
# configuration
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
# output assets dir
asset_dir = "/home/user/.secrets/clusters/yavin"
asset_dir = "/home/user/.secrets/clusters/yavin"
# optional
worker_count = 2
}
```
@ -151,7 +150,7 @@ Get or update Terraform modules.
$ terraform get # downloads missing modules
$ terraform get --update # updates all modules
Get: git::https://github.com/poseidon/typhoon (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.12.0 (update)
```
Plan the resources to be created.
@ -185,9 +184,9 @@ In 4-8 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.9.4
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.4
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.4
yavin-controller-0.c.example-com.internal Ready 6m v1.10.1
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.1
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.1
```
List the pods.
@ -227,8 +226,7 @@ Learn about [maintenance](topics/maintenance.md) and [addons](addons/overview.md
| region | Google Cloud region | "us-central1" |
| dns_zone | Google Cloud DNS zone | "google-cloud.example.com" |
| dns_zone_name | Google Cloud DNS zone name | "example-zone" |
| ssh_authorized_key | SSH public key for ~/.ssh_authorized_keys | "ssh-rsa AAAAB3NZ..." |
| os_image | OS image for compute instances | "coreos-stable" |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
| asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/yavin" |
Check the list of valid [regions](https://cloud.google.com/compute/docs/regions-zones/regions-zones) and list Container Linux [images](https://cloud.google.com/compute/docs/images) with `gcloud compute images list | grep coreos`.
@ -254,13 +252,18 @@ resource "google_dns_managed_zone" "zone-for-clusters" {
| Name | Description | Default | Example |
|:-----|:------------|:--------|:--------|
| machine_type | Machine type for compute instances | "n1-standard-1" | See below |
| controller_count | Number of controllers (i.e. masters) | 1 | 1 |
| worker_count | Number of workers | 1 | 3 |
| controller_type | Machine type for controllers | "n1-standard-1" | See below |
| worker_type | Machine type for workers | "n1-standard-1" | See below |
| os_image | Container Linux image for compute instances | "coreos-stable" | "coreos-stable-1632-3-0-v20180215" |
| disk_size | Size of the disk in GB | 40 | 100 |
| worker_preemptible | If enabled, Compute Engine will terminate workers randomly within 24 hours | false | true |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | |
| networking | Choice of networking provider | "calico" | "calico" or "flannel" |
| pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
Check the list of valid [machine types](https://cloud.google.com/compute/docs/machine-types).

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.9.4 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/) and [preemption](https://typhoon.psdn.io/google-cloud/#preemption) (varies by platform)
@ -44,29 +44,28 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
```tf
module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.9.4"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.10.1"
providers = {
google = "google.default"
local = "local.default"
null = "null.default"
google = "google.default"
local = "local.default"
null = "null.default"
template = "template.default"
tls = "tls.default"
tls = "tls.default"
}
# Google Cloud
cluster_name = "yavin"
region = "us-central1"
dns_zone = "example.com"
dns_zone_name = "example-zone"
os_image = "coreos-stable"
cluster_name = "yavin"
controller_count = 1
worker_count = 2
# configuration
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
# output assets dir
asset_dir = "/home/user/.secrets/clusters/yavin"
asset_dir = "/home/user/.secrets/clusters/yavin"
# optional
worker_count = 2
}
```
@ -86,9 +85,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.9.4
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.9.4
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.9.4
yavin-controller-0.c.example-com.internal Ready 6m v1.10.1
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.1
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.1
```
List the pods.

View File

@ -162,14 +162,14 @@ show ip bgp neighbors
show ip route bgp
```
Be sure to register the peer by creating a Calico `bgpPeer` CRD with `kubectl apply`.
Be sure to register the peer by creating a Calico `BGPPeer` CRD with `kubectl apply`.
```
apiVersion: v1
kind: bgpPeer
apiVersion: crd.projectcalico.org/v1
kind: BGPPeer
metadata:
peerIP: LAN_IP
scope: global
name: NAME
spec:
peerIP: LAN_IP
asNumber: 64512
```

View File

@ -18,7 +18,7 @@ module "google-cloud-yavin" {
}
module "bare-metal-mercury" {
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.9.4"
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.10.1"
...
}
```
@ -205,7 +205,7 @@ You should now be able to run `terraform plan` without errors. When you choose,
## terraform-provider-ct v0.2.1
Typhoon recommends updating the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin installed on your system from v0.2.0 to [v0.2.1](https://github.com/coreos/terraform-provider-ct/releases/tag/v0.2.1). The release contains an important feature that will be used in future Typhoon releases.
Typhoon requires updating the [terraform-provider-ct](https://github.com/coreos/terraform-provider-ct) plugin installed on your system from v0.2.0 to [v0.2.1](https://github.com/coreos/terraform-provider-ct/releases/tag/v0.2.1).
Check your `~/.terraformrc` to find your current `terraform-provider-ct` plugin.
@ -236,4 +236,3 @@ Verify Terraform does not produce a diff related to Container Linux provisioning
terraform plan
```
You're prepared for future Typhoon releases.

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.9.4 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)

View File

@ -17,7 +17,7 @@ resource "google_dns_record_set" "controllers" {
rrdatas = ["${google_compute_address.controllers-ip.address}"]
}
# Network Load Balancer (i.e. forwarding rule)
# Network Load Balancer for controllers
resource "google_compute_forwarding_rule" "controller-https-rule" {
name = "${var.cluster_name}-controller-https-rule"
ip_address = "${google_compute_address.controllers-ip.address}"

View File

@ -1,10 +1,10 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=c5fc93d95fe4993511656cdd6372afbd1307f08f"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=db36b92abced3c4b0af279adfd5ed4bf0cf8c39f"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
etcd_servers = "${module.controllers.etcd_fqdns}"
etcd_servers = ["${null_resource.repeat.*.triggers.domain}"]
asset_dir = "${var.asset_dir}"
networking = "${var.networking}"
network_mtu = 1440

View File

@ -7,12 +7,13 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.2"
Environment="ETCD_IMAGE_TAG=v3.3.3"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
Environment="ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379"
Environment="ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380"
Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
Environment="ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}"
Environment="ETCD_STRICT_RECONFIG_CHECK=true"
Environment="ETCD_SSL_DIR=/etc/ssl/etcd"
@ -82,6 +83,7 @@ systemd:
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/master \
--node-labels=node-role.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
@ -116,8 +118,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.4
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -138,7 +140,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.12.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \

View File

@ -1,40 +0,0 @@
module "controllers" {
source = "controllers"
cluster_name = "${var.cluster_name}"
# GCE
region = "${var.region}"
network = "${google_compute_network.network.name}"
dns_zone = "${var.dns_zone}"
dns_zone_name = "${var.dns_zone_name}"
count = "${var.controller_count}"
machine_type = "${var.machine_type}"
os_image = "${var.os_image}"
# configuration
networking = "${var.networking}"
kubeconfig = "${module.bootkube.kubeconfig}"
ssh_authorized_key = "${var.ssh_authorized_key}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}
module "workers" {
source = "workers"
name = "${var.cluster_name}"
cluster_name = "${var.cluster_name}"
# GCE
region = "${var.region}"
network = "${google_compute_network.network.name}"
count = "${var.worker_count}"
machine_type = "${var.machine_type}"
os_image = "${var.os_image}"
preemptible = "${var.worker_preemptible}"
# configuration
kubeconfig = "${module.bootkube.kubeconfig}"
ssh_authorized_key = "${var.ssh_authorized_key}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}

View File

@ -1,6 +1,6 @@
# Discrete DNS records for each controller's private IPv4 for etcd usage
resource "google_dns_record_set" "etcds" {
count = "${var.count}"
count = "${var.controller_count}"
# DNS Zone name where record should be created
managed_zone = "${var.dns_zone_name}"
@ -21,11 +21,11 @@ data "google_compute_zones" "all" {
# Controller instances
resource "google_compute_instance" "controllers" {
count = "${var.count}"
count = "${var.controller_count}"
name = "${var.cluster_name}-controller-${count.index}"
zone = "${element(data.google_compute_zones.all.names, count.index)}"
machine_type = "${var.machine_type}"
machine_type = "${var.controller_type}"
metadata {
user-data = "${element(data.ct_config.controller_ign.*.rendered, count.index)}"
@ -41,7 +41,7 @@ resource "google_compute_instance" "controllers" {
}
network_interface {
network = "${var.network}"
network = "${google_compute_network.network.name}"
# Ephemeral external IP
access_config = {}
@ -51,9 +51,13 @@ resource "google_compute_instance" "controllers" {
tags = ["${var.cluster_name}-controller"]
}
locals {
controllers_ipv4_public = ["${google_compute_instance.controllers.*.network_interface.0.access_config.0.assigned_nat_ip}"]
}
# Controller Container Linux Config
data "template_file" "controller_config" {
count = "${var.count}"
count = "${var.controller_count}"
template = "${file("${path.module}/cl/controller.yaml.tmpl")}"
@ -65,17 +69,17 @@ data "template_file" "controller_config" {
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
etcd_initial_cluster = "${join(",", formatlist("%s=https://%s:2380", null_resource.repeat.*.triggers.name, null_resource.repeat.*.triggers.domain))}"
kubeconfig = "${indent(10, module.bootkube.kubeconfig)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
ssh_authorized_key = "${var.ssh_authorized_key}"
kubeconfig = "${indent(10, var.kubeconfig)}"
}
}
# Horrible hack to generate a Terraform list of a desired length without dependencies.
# Ideal ${repeat("etcd", 3) -> ["etcd", "etcd", "etcd"]}
resource null_resource "repeat" {
count = "${var.count}"
count = "${var.controller_count}"
triggers {
name = "etcd${count.index}"
@ -84,7 +88,8 @@ resource null_resource "repeat" {
}
data "ct_config" "controller_ign" {
count = "${var.count}"
count = "${var.controller_count}"
content = "${element(data.template_file.controller_config.*.rendered, count.index)}"
pretty_print = false
snippets = ["${var.controller_clc_snippets}"]
}

View File

@ -1,7 +0,0 @@
output "etcd_fqdns" {
value = ["${null_resource.repeat.*.triggers.domain}"]
}
output "ipv4_public" {
value = ["${google_compute_instance.controllers.*.network_interface.0.access_config.0.assigned_nat_ip}"]
}

View File

@ -1,81 +0,0 @@
variable "cluster_name" {
type = "string"
description = "Unique cluster name"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for logging in as user 'core'"
}
variable "network" {
type = "string"
description = "Name of the network to attach to the compute instance interfaces"
}
variable "dns_zone" {
type = "string"
description = "Google Cloud DNS Zone value to create etcd/k8s subdomains (e.g. dghubble.io)"
}
variable "dns_zone_name" {
type = "string"
description = "Google Cloud DNS Zone name to create etcd/k8s subdomains (e.g. dghubble-io)"
}
# instances
variable "count" {
type = "string"
description = "Number of controller compute instances the instance group should manage"
}
variable "region" {
type = "string"
description = "Google Cloud region (e.g. us-central1, see `gcloud compute regions list`)."
}
variable "machine_type" {
type = "string"
description = "Machine type for compute instances (e.g. gcloud compute machine-types list)"
}
variable "os_image" {
type = "string"
description = "OS image from which to initialize the disk (e.g. gcloud compute images list)"
}
variable "disk_size" {
type = "string"
default = "40"
description = "The size of the disk in gigabytes."
}
// configuration
variable "networking" {
description = "Choice of networking provider (flannel or calico)"
type = "string"
default = "flannel"
}
variable "service_cidr" {
description = <<EOD
CIDR IP range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
EOD
type = "string"
default = "10.3.0.0/16"
}
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by kube-dns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = "string"
default = "cluster.local"
}
variable "kubeconfig" {
type = "string"
description = "Generated Kubelet kubeconfig"
}

View File

@ -56,6 +56,20 @@ resource "google_compute_firewall" "internal-etcd" {
target_tags = ["${var.cluster_name}-controller"]
}
# Allow Prometheus to scrape etcd metrics
resource "google_compute_firewall" "internal-etcd-metrics" {
name = "${var.cluster_name}-internal-etcd-metrics"
network = "${google_compute_network.network.name}"
allow {
protocol = "tcp"
ports = [2381]
}
source_tags = ["${var.cluster_name}-worker"]
target_tags = ["${var.cluster_name}-controller"]
}
# Calico BGP and IPIP
# https://docs.projectcalico.org/v2.5/reference/public-cloud/gce
resource "google_compute_firewall" "internal-calico" {
@ -93,7 +107,7 @@ resource "google_compute_firewall" "internal-flannel" {
target_tags = ["${var.cluster_name}-controller", "${var.cluster_name}-worker"]
}
# Allow prometheus (workload) to scrape node-exporter daemonset
# Allow Prometheus to scrape node-exporter daemonset
resource "google_compute_firewall" "internal-node-exporter" {
name = "${var.cluster_name}-internal-node-exporter"
network = "${google_compute_network.network.name}"

View File

@ -1,19 +1,22 @@
# Deprecated
output "controllers_ipv4_public" {
value = ["${module.controllers.ipv4_public}"]
value = ["${google_compute_instance.controllers.*.network_interface.0.access_config.0.assigned_nat_ip}"]
}
output "ingress_static_ip" {
value = "${module.workers.ingress_static_ip}"
}
output "network_name" {
value = "${google_compute_network.network.name}"
}
output "network_self_link" {
value = "${google_compute_network.network.self_link}"
}
# Outputs for worker pools
output "network_name" {
value = "${google_compute_network.network.name}"
}
output "kubeconfig" {
value = "${module.bootkube.kubeconfig}"
}

View File

@ -1,20 +1,14 @@
# Secure copy etcd TLS assets and kubeconfig to controllers. Activates kubelet.service
resource "null_resource" "copy-secrets" {
depends_on = ["module.bootkube"]
count = "${var.controller_count}"
# Secure copy etcd TLS assets to controllers.
resource "null_resource" "copy-controller-secrets" {
count = "${var.controller_count}"
connection {
type = "ssh"
host = "${element(module.controllers.ipv4_public, count.index)}"
host = "${element(local.controllers_ipv4_public, count.index)}"
user = "core"
timeout = "15m"
}
provisioner "file" {
content = "${module.bootkube.kubeconfig}"
destination = "$HOME/kubeconfig"
}
provisioner "file" {
content = "${module.bootkube.etcd_ca_cert}"
destination = "$HOME/etcd-client-ca.crt"
@ -62,7 +56,6 @@ resource "null_resource" "copy-secrets" {
"sudo mv etcd-peer.key /etc/ssl/etcd/etcd/peer.key",
"sudo chown -R etcd:etcd /etc/ssl/etcd",
"sudo chmod -R 500 /etc/ssl/etcd",
"sudo mv /home/core/kubeconfig /etc/kubernetes/kubeconfig",
]
}
}
@ -70,11 +63,16 @@ resource "null_resource" "copy-secrets" {
# Secure copy bootkube assets to ONE controller and start bootkube to perform
# one-time self-hosted cluster bootstrapping.
resource "null_resource" "bootkube-start" {
depends_on = ["module.controllers", "module.bootkube", "module.workers", "null_resource.copy-secrets"]
depends_on = [
"module.bootkube",
"module.workers",
"google_dns_record_set.controllers",
"null_resource.copy-controller-secrets",
]
connection {
type = "ssh"
host = "${element(module.controllers.ipv4_public, 0)}"
host = "${element(local.controllers_ipv4_public, 0)}"
user = "core"
timeout = "15m"
}
@ -86,7 +84,7 @@ resource "null_resource" "bootkube-start" {
provisioner "remote-exec" {
inline = [
"sudo mv /home/core/assets /opt/bootkube",
"sudo mv $HOME/assets /opt/bootkube",
"sudo systemctl start bootkube",
]
}

View File

@ -1,8 +1,10 @@
variable "cluster_name" {
type = "string"
description = "Cluster name"
description = "Unique cluster name (prepended to dns_zone)"
}
# Google Cloud
variable "region" {
type = "string"
description = "Google Cloud Region (e.g. us-central1, see `gcloud compute regions list`)"
@ -10,34 +12,20 @@ variable "region" {
variable "dns_zone" {
type = "string"
description = "Google Cloud DNS Zone (e.g. google-cloud.dghubble.io)"
description = "Google Cloud DNS Zone (e.g. google-cloud.example.com)"
}
variable "dns_zone_name" {
type = "string"
description = "Google Cloud DNS Zone name (e.g. google-cloud-prod-zone)"
description = "Google Cloud DNS Zone name (e.g. example-zone)"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for user 'core'"
}
variable "machine_type" {
type = "string"
default = "n1-standard-1"
description = "Machine type for compute instances (see `gcloud compute machine-types list`)"
}
variable "os_image" {
type = "string"
description = "OS image from which to initialize the disk (see `gcloud compute images list`)"
}
# instances
variable "controller_count" {
type = "string"
default = "1"
description = "Number of controllers"
description = "Number of controllers (i.e. masters)"
}
variable "worker_count" {
@ -46,13 +34,54 @@ variable "worker_count" {
description = "Number of workers"
}
variable "controller_type" {
type = "string"
default = "n1-standard-1"
description = "Machine type for controllers (see `gcloud compute machine-types list`)"
}
variable "worker_type" {
type = "string"
default = "n1-standard-1"
description = "Machine type for controllers (see `gcloud compute machine-types list`)"
}
variable "os_image" {
type = "string"
default = "coreos-stable"
description = "Container Linux image for compute instances (e.g. coreos-stable)"
}
variable "disk_size" {
type = "string"
default = "40"
description = "Size of the disk in GB"
}
variable "worker_preemptible" {
type = "string"
default = "false"
description = "If enabled, Compute Engine will terminate workers randomly within 24 hours"
}
# bootkube assets
variable "controller_clc_snippets" {
type = "list"
description = "Controller Container Linux Config snippets"
default = []
}
variable "worker_clc_snippets" {
type = "list"
description = "Worker Container Linux Config snippets"
default = []
}
# configuration
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for user 'core'"
}
variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
@ -66,14 +95,14 @@ variable "networking" {
}
variable "pod_cidr" {
description = "CIDR IP range to assign Kubernetes pods"
description = "CIDR IPv4 range to assign Kubernetes pods"
type = "string"
default = "10.2.0.0/16"
}
variable "service_cidr" {
description = <<EOD
CIDR IP range to assign Kubernetes services.
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
EOD

View File

@ -0,0 +1,21 @@
module "workers" {
source = "workers"
name = "${var.cluster_name}"
cluster_name = "${var.cluster_name}"
# GCE
region = "${var.region}"
network = "${google_compute_network.network.name}"
count = "${var.worker_count}"
machine_type = "${var.worker_type}"
os_image = "${var.os_image}"
disk_size = "${var.disk_size}"
preemptible = "${var.worker_preemptible}"
# configuration
kubeconfig = "${module.bootkube.kubeconfig}"
ssh_authorized_key = "${var.ssh_authorized_key}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
clc_snippets = "${var.worker_clc_snippets}"
}

View File

@ -40,8 +40,6 @@ systemd:
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
@ -90,8 +88,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.9.4
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -109,7 +107,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://gcr.io/google_containers/hyperkube:v1.9.4 \
docker://k8s.gcr.io/hyperkube:v1.10.1 \
--net=host \
--dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)

View File

@ -1,21 +1,23 @@
variable "name" {
type = "string"
description = "Unique name for instance group"
description = "Unique name for the worker pool"
}
variable "cluster_name" {
type = "string"
description = "Cluster name"
description = "Must be set to `cluster_name of cluster`"
}
# Google Cloud
variable "region" {
type = "string"
description = "Google Cloud region (e.g. us-central1, see `gcloud compute regions list`)."
description = "Must be set to `region` of cluster"
}
variable "network" {
type = "string"
description = "Name of the network to attach to the compute instance interfaces"
description = "Must be set to `network_name` output by cluster"
}
# instances
@ -35,13 +37,13 @@ variable "machine_type" {
variable "os_image" {
type = "string"
default = "coreos-stable"
description = "OS image from which to initialize the disk (e.g. gcloud compute images list)"
description = "Container Linux image for compute instanges (e.g. gcloud compute images list)"
}
variable "disk_size" {
type = "string"
default = "40"
description = "The size of the disk in gigabytes."
description = "Size of the disk in GB"
}
variable "preemptible" {
@ -54,17 +56,17 @@ variable "preemptible" {
variable "kubeconfig" {
type = "string"
description = "Generated Kubelet kubeconfig"
description = "Must be set to `kubeconfig` output by cluster"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for logging in as user 'core'"
description = "SSH public key for user 'core'"
}
variable "service_cidr" {
description = <<EOD
CIDR IP range to assign Kubernetes services.
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
EOD
@ -78,16 +80,22 @@ variable "cluster_domain_suffix" {
default = "cluster.local"
}
variable "clc_snippets" {
type = "list"
description = "Container Linux Config snippets"
default = []
}
# unofficial, undocumented, unsupported, temporary
variable "accelerator_type" {
type = "string"
default = ""
type = "string"
default = ""
description = "Google Compute Engine accelerator type (e.g. nvidia-tesla-k80, see gcloud compute accelerator-types list)"
}
variable "accelerator_count" {
type = "string"
default = "0"
type = "string"
default = "0"
description = "Number of compute engine accelerators"
}

View File

@ -32,6 +32,7 @@ data "template_file" "worker_config" {
data "ct_config" "worker_ign" {
content = "${data.template_file.worker_config.rendered}"
pretty_print = false
snippets = ["${var.clc_snippets}"]
}
resource "google_compute_instance_template" "worker" {
@ -63,11 +64,11 @@ resource "google_compute_instance_template" "worker" {
}
can_ip_forward = true
tags = ["worker", "${var.cluster_name}-worker", "${var.name}-worker"]
tags = ["worker", "${var.cluster_name}-worker", "${var.name}-worker"]
guest_accelerator {
count = "${var.accelerator_count}"
type = "${var.accelerator_type}"
type = "${var.accelerator_type}"
}
lifecycle {

View File

@ -1,5 +1,5 @@
mkdocs==0.17.2
mkdocs-material==2.2.6
mkdocs==0.17.3
mkdocs-material==2.7.1
pygments==2.2.0
pymdown-extensions==3.5
six==1.10.0