Compare commits


14 Commits

Author SHA1 Message Date
77c0a4cf2e Update Kubernetes from v1.10.0 to v1.10.1
* Use kubernetes-incubator/bootkube v0.12.0
2018-04-12 20:57:31 -07:00
5035d56db2 Refactor GCP to remove controller internal module
* Remove the controller internal module to align with
other platforms and since it's not a supported use case
2018-04-12 19:41:51 -07:00
9bb3de5327 Skip creating unused dirs on worker nodes 2018-04-11 22:23:51 -07:00
c8eabc2af4 Fix GCP controller_type and worker_type vars 2018-04-11 22:19:58 -07:00
2eaf858c5c Update example BGPPeer manifest
The previous example may have been outdated. It resulted in `error: unable to recognize "example.yaml": no matches for /, Kind=bgpPeer`. The corrected manifest is sketched after this entry.

See https://docs.projectcalico.org/v3.0/reference/calicoctl/resources/bgppeer.
2018-04-09 23:23:18 -05:00
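For reference, the corrected resource (as it appears in the updated bare-metal docs further down in this compare) looks like the following; `NAME` and `LAN_IP` are placeholders for the peer's name and address:

```
apiVersion: crd.projectcalico.org/v1
kind: BGPPeer
metadata:
  name: NAME
spec:
  peerIP: LAN_IP
  asNumber: 64512
```

Apply it with `kubectl apply -f example.yaml` once the Calico v3.0 resources are installed.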
b8656fd74b Clarify bare-metal SSH instructions 2018-04-08 14:11:05 -07:00
d276fffcda Fix bare-metal multiple apply/ssh on Terraform v0.11.4+
* Terraform v0.11.4 introduced changes to remote-exec
that meant Typhoon bare-metal clusters required multiple
runs of terraform apply to SSH and bootstrap.
* Bare-metal installs PXE boot a live instance to install
to disk and then reboot from disk as controllers/workers.
Terraform remote-exec has no way to "know" to wait until
the reboot has occurred before kicking off Kubernetes bootstrap.
Previously, Typhoon created a "debug" user during this
install phase to allow an admin to SSH, but remote-exec
would hang, trying to connect as user "core". Terraform
v0.11.4 changes this behavior so remote-exec fails and
a user must re-run terraform apply until it succeeds.
* A new way to "trick" remote-exec into waiting for the
reboot into the disk install is to run SSH on a non-standard
port during the disk install (see the drop-in sketch after
this entry). This retains the ability for an admin to SSH
during install (most distros don't offer this) and fixes
the issue so only a single run of terraform apply is needed.
* https://github.com/hashicorp/terraform/pull/17359#issuecomment-376415464
2018-04-08 13:32:31 -07:00
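A minimal sketch of the install-phase drop-in that implements this, matching the updated Container Linux install config further down in this compare:

```
systemd:
  units:
    - name: sshd.socket
      dropins:
        - name: 10-sshd-port.conf
          contents: |
            [Socket]
            ListenStream=
            ListenStream=2222
```

The empty `ListenStream=` clears the default port 22 before adding 2222, so sshd only listens on 2222 during the disk install; the installed system has no such drop-in and listens on port 22 as usual.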
6b08bde479 Use k8s.gcr.io instead of gcr.io/google_containers
* Kubernetes recommends using the alias to fetch images
from the nearest GCR regional mirror, to abstract the use
of GCR, and to drop names containing 'google' (example
after this entry)
* https://groups.google.com/forum/#!msg/kubernetes-dev/ytjk_rNrTa0/3EFUHvovCAAJ
2018-04-08 12:57:52 -07:00
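In practice the change is a mechanical image-reference swap across the addon manifests and Ignition templates in this compare, for example:

```
# before
image: gcr.io/google_containers/defaultbackend:1.4
# after
image: k8s.gcr.io/defaultbackend:1.4
```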
f4b2396718 Return Prometheus deployment to be a worker workload
* Expose etcd metrics to workers so Prometheus can
run on a worker, rather than a controller (scrape job
excerpted after this entry)
* Drop temporary firewall rules allowing Prometheus
to run on a controller and scrape targets
* Related to https://github.com/poseidon/typhoon/pull/175
2018-04-08 12:20:00 -07:00
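A trimmed excerpt of the scrape job added to the Prometheus config in this compare; it keeps only nodes labeled as controllers and rewrites the scrape address to the etcd metrics port:

```
- job_name: 'etcd'
  kubernetes_sd_configs:
    - role: node
  scheme: http
  relabel_configs:
    # keep only controller nodes
    - source_labels: [__meta_kubernetes_node_label_node_role_kubernetes_io_controller]
      action: keep
      regex: 'true'
    # scrape the etcd metrics port instead of the kubelet
    - source_labels: [__meta_kubernetes_node_name]
      action: replace
      target_label: __address__
      replacement: '${1}:2381'
```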
b76126db93 Update docs builder and material theme 2018-04-08 00:00:03 -07:00
7186aa46da Update kube-state-metrics from v1.2.0 to v1.3.0
* https://github.com/kubernetes/kube-state-metrics/pull/412
* https://github.com/kubernetes/kube-state-metrics/pull/413
2018-04-04 21:04:13 -07:00
18dbaf74ce Update kube-dns from v1.14.8 to v1.14.9
* https://github.com/kubernetes/kubernetes/pull/61908
2018-04-04 21:00:23 -07:00
ce001e9d56 Update etcd from v3.3.2 to v3.3.3
* https://github.com/coreos/etcd/releases/tag/v3.3.3
2018-04-04 20:32:24 -07:00
d770393dbc Add etcd metrics, Prometheus scrapes, and Grafana dash
* Use etcd v3.3 --listen-metrics-urls to expose only metrics
data via http://0.0.0.0:2381 on controllers (drop-in sketched
after this entry)
* Add Prometheus discovery for etcd peers on controller nodes
* Temporarily drop two noisy Prometheus alerts
2018-04-03 20:31:00 -07:00
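On the controller side, the change amounts to one extra environment line in the etcd drop-in rendered by each platform's Container Linux Config. A sketch, assuming the standard Container Linux `etcd-member.service` unit name (the hunks below only show the drop-in contents):

```
systemd:
  units:
    # unit name assumed (standard Container Linux etcd-member unit);
    # the drop-in contents come from the hunks in this compare
    - name: etcd-member.service
      dropins:
        - name: 40-etcd-cluster.conf
          contents: |
            [Service]
            Environment="ETCD_IMAGE_TAG=v3.3.3"
            Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
```

With this, Prometheus can scrape etcd metrics over plain HTTP on port 2381 while client and peer traffic stays on the TLS ports 2379/2380.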
48 changed files with 238 additions and 294 deletions

View File

@ -4,6 +4,33 @@ Notable changes between versions.
## Latest
* Kubernetes [v1.10.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1101)
* Enable etcd v3.3 metrics endpoint ([#175](https://github.com/poseidon/typhoon/pull/175))
* Use `k8s.gcr.io` instead of `gcr.io/google_containers` ([#180](https://github.com/poseidon/typhoon/pull/180))
* Kubernetes [recommends](https://groups.google.com/forum/#!msg/kubernetes-dev/ytjk_rNrTa0/3EFUHvovCAAJ) using the alias to pull from the nearest regional mirror and to abstract the backing container registry
* Update kube-dns from v1.14.8 to v1.14.9
* Update etcd from v3.3.2 to v3.3.3
* Use kubernetes-incubator/bootkube v0.12.0
#### Bare-Metal
* Fix need for multiple `terraform apply` runs to create a cluster with Terraform v0.11.4 ([#181](https://github.com/poseidon/typhoon/pull/181))
* To SSH during a disk install for debugging, SSH as user "core" on port 2222
* Remove the old trick of using a user "debug" during disk install
#### Google Cloud
* Refactor out the `controller` internal module
#### Addons
* Add Prometheus discovery for etcd peers on controller nodes ([#175](https://github.com/poseidon/typhoon/pull/175))
* Scrape etcd v3.3 `--listen-metrics-urls` for metrics
* Enable etcd alerts and populate the etcd Grafana dashboard
* Update kube-state-metrics from v1.2.0 to v1.3.0
## v1.10.0
* Kubernetes [v1.10.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1100)
* Remove unused, unmaintained `pxe-worker` internal module
@ -77,7 +104,7 @@ Notable changes between versions.
* Allow flexvolume plugins to be used on any Typhoon cluster (not just bare-metal)
* Upgrade etcd from v3.2.15 to v3.3.2
* Update Calico from v3.0.2 to v3.0.3
* Use kubernetes-incubator/bootkube v0.10.0
* Use kubernetes-incubator/bootkube v0.11.0
* [Recommend](https://typhoon.psdn.io/topics/maintenance/#terraform-provider-ct-v021) updating `terraform-provider-ct` plugin from v0.2.0 to [v0.2.1](https://github.com/coreos/terraform-provider-ct/releases/tag/v0.2.1) (action recommended)
#### AWS

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.10.0 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/) and [preemption](https://typhoon.psdn.io/google-cloud/#preemption) (varies by platform)
@ -44,7 +44,7 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
```tf
module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.10.0"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.10.1"
providers = {
google = "google.default"
@ -86,9 +86,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.10.0
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.0
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.0
yavin-controller-0.c.example-com.internal Ready 6m v1.10.1
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.1
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.1
```
List the pods.

View File

@ -20,7 +20,7 @@ spec:
# Any image is permissible as long as:
# 1. It serves a 404 page at /
# 2. It serves 200 on a /healthz endpoint
image: gcr.io/google_containers/defaultbackend:1.4
image: k8s.gcr.io/defaultbackend:1.4
ports:
- containerPort: 8080
resources:

View File

@ -20,7 +20,7 @@ spec:
# Any image is permissible as long as:
# 1. It serves a 404 page at /
# 2. It serves 200 on a /healthz endpoint
image: gcr.io/google_containers/defaultbackend:1.4
image: k8s.gcr.io/defaultbackend:1.4
ports:
- containerPort: 8080
resources:

View File

@ -20,7 +20,7 @@ spec:
# Any image is permissible as long as:
# 1. It serves a 404 page at /
# 2. It serves 200 on a /healthz endpoint
image: gcr.io/google_containers/defaultbackend:1.4
image: k8s.gcr.io/defaultbackend:1.4
ports:
- containerPort: 8080
resources:

View File

@ -112,6 +112,22 @@ data:
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
# Scrape etcd metrics from controllers
- job_name: 'etcd'
kubernetes_sd_configs:
- role: node
scheme: http
relabel_configs:
- source_labels: [__meta_kubernetes_node_label_node_role_kubernetes_io_controller]
action: keep
regex: 'true'
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- source_labels: [__meta_kubernetes_node_name]
action: replace
target_label: __address__
replacement: '${1}:2381'
# Scrape config for service endpoints.
#
# The relabeling allows the actual service scrape endpoint to be configured

View File

@ -5,6 +5,8 @@ metadata:
rules:
- apiGroups: [""]
resources:
- configmaps
- secrets
- nodes
- pods
- services

View File

@ -22,7 +22,7 @@ spec:
serviceAccountName: kube-state-metrics
containers:
- name: kube-state-metrics
image: quay.io/coreos/kube-state-metrics:v1.2.0
image: quay.io/coreos/kube-state-metrics:v1.3.0
ports:
- name: metrics
containerPort: 8080
@ -33,7 +33,7 @@ spec:
initialDelaySeconds: 5
timeoutSeconds: 5
- name: addon-resizer
image: gcr.io/google_containers/addon-resizer:1.7
image: k8s.gcr.io/addon-resizer:1.7
resources:
limits:
cpu: 100m

View File

@ -63,26 +63,6 @@ data:
description: etcd instance {{ $labels.instance }} has seen {{ $value }} leader
changes within the last hour
summary: a high number of leader changes within the etcd cluster are happening
- alert: HighNumberOfFailedGRPCRequests
expr: sum(rate(grpc_server_handled_total{grpc_code!="OK",job="etcd"}[5m])) BY (grpc_service, grpc_method)
/ sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method) > 0.01
for: 10m
labels:
severity: warning
annotations:
description: '{{ $value }}% of requests for {{ $labels.grpc_method }} failed
on etcd instance {{ $labels.instance }}'
summary: a high number of gRPC requests are failing
- alert: HighNumberOfFailedGRPCRequests
expr: sum(rate(grpc_server_handled_total{grpc_code!="OK",job="etcd"}[5m])) BY (grpc_service, grpc_method)
/ sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method) > 0.05
for: 5m
labels:
severity: critical
annotations:
description: '{{ $value }}% of requests for {{ $labels.grpc_method }} failed
on etcd instance {{ $labels.instance }}'
summary: a high number of gRPC requests are failing
- alert: GRPCRequestsSlow
expr: histogram_quantile(0.99, sum(rate(grpc_server_handling_seconds_bucket{job="etcd",grpc_type="unary"}[5m])) by (grpc_service, grpc_method, le))
> 0.15

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.10.0 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/)

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=5f3546b66ffb9946b36e612537bb6a1830ae7746"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=db36b92abced3c4b0af279adfd5ed4bf0cf8c39f"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]

View File

@ -7,12 +7,13 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.2"
Environment="ETCD_IMAGE_TAG=v3.3.3"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
Environment="ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379"
Environment="ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380"
Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
Environment="ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}"
Environment="ETCD_STRICT_RECONFIG_CHECK=true"
Environment="ETCD_SSL_DIR=/etc/ssl/etcd"
@ -116,8 +117,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.10.0
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -138,7 +139,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.12.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \

View File

@ -51,6 +51,16 @@ resource "aws_security_group_rule" "controller-etcd" {
self = true
}
resource "aws_security_group_rule" "controller-etcd-metrics" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 2381
to_port = 2381
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-flannel" {
security_group_id = "${aws_security_group.controller.id}"

View File

@ -39,8 +39,6 @@ systemd:
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
@ -89,8 +87,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.10.0
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -108,7 +106,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://gcr.io/google_containers/hyperkube:v1.10.0 \
docker://k8s.gcr.io/hyperkube:v1.10.1 \
--net=host \
--dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.10.0 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=5f3546b66ffb9946b36e612537bb6a1830ae7746"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=db36b92abced3c4b0af279adfd5ed4bf0cf8c39f"
cluster_name = "${var.cluster_name}"
api_servers = ["${var.k8s_domain_name}"]

View File

@ -12,6 +12,16 @@ systemd:
ExecStart=/opt/installer
[Install]
WantedBy=multi-user.target
# Avoid using the standard SSH port so terraform apply cannot SSH until
# post-install. But admins may SSH to debug disk install problems.
# After install, sshd will use port 22 and users/terraform can connect.
- name: sshd.socket
dropins:
- name: 10-sshd-port.conf
contents: |
[Socket]
ListenStream=
ListenStream=2222
storage:
files:
- path: /opt/installer
@ -32,11 +42,6 @@ storage:
systemctl reboot
passwd:
users:
# Avoid using standard name "core" so terraform apply cannot SSH until post-install.
- name: debug
create:
groups:
- sudo
- docker
- name: core
ssh_authorized_keys:
- {{.ssh_authorized_key}}
- "${ssh_authorized_key}"

View File

@ -7,12 +7,13 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.2"
Environment="ETCD_IMAGE_TAG=v3.3.3"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${domain_name}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${domain_name}:2380"
Environment="ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379"
Environment="ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380"
Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
Environment="ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}"
Environment="ETCD_STRICT_RECONFIG_CHECK=true"
Environment="ETCD_SSL_DIR=/etc/ssl/etcd"
@ -117,8 +118,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.10.0
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/hostname
filesystem: root
mode: 0644
@ -145,7 +146,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.12.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \

View File

@ -47,8 +47,6 @@ systemd:
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
@ -81,8 +79,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.10.0
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/hostname
filesystem: root
mode: 0644

View File

@ -8,10 +8,6 @@ resource "matchbox_group" "container-linux-install" {
selector {
mac = "${element(concat(var.controller_macs, var.worker_macs), count.index)}"
}
metadata {
ssh_authorized_key = "${var.ssh_authorized_key}"
}
}
resource "matchbox_group" "controller" {

View File

@ -32,6 +32,7 @@ data "template_file" "container-linux-install-configs" {
ignition_endpoint = "${format("%s/ignition", var.matchbox_http_endpoint)}"
install_disk = "${var.install_disk}"
container_linux_oem = "${var.container_linux_oem}"
ssh_authorized_key = "${var.ssh_authorized_key}"
# only cached-container-linux profile adds -b baseurl
baseurl_flag = ""
@ -73,6 +74,7 @@ data "template_file" "cached-container-linux-install-configs" {
ignition_endpoint = "${format("%s/ignition", var.matchbox_http_endpoint)}"
install_disk = "${var.install_disk}"
container_linux_oem = "${var.container_linux_oem}"
ssh_authorized_key = "${var.ssh_authorized_key}"
# profile uses -b baseurl to install from matchbox cache
baseurl_flag = "-b ${var.matchbox_http_endpoint}/assets/coreos"

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.10.0 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=5f3546b66ffb9946b36e612537bb6a1830ae7746"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=db36b92abced3c4b0af279adfd5ed4bf0cf8c39f"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]

View File

@ -7,12 +7,13 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.2"
Environment="ETCD_IMAGE_TAG=v3.3.3"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
Environment="ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379"
Environment="ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380"
Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
Environment="ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}"
Environment="ETCD_STRICT_RECONFIG_CHECK=true"
Environment="ETCD_SSL_DIR=/etc/ssl/etcd"
@ -122,8 +123,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.10.0
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -144,7 +145,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.12.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \

View File

@ -50,8 +50,6 @@ systemd:
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
@ -95,8 +93,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.10.0
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -114,7 +112,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://gcr.io/google_containers/hyperkube:v1.10.0 \
docker://k8s.gcr.io/hyperkube:v1.10.1 \
--net=host \
--dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)

View File

@ -13,7 +13,7 @@ Create a cluster following the AWS [tutorial](../aws.md#cluster). Define a worke
```tf
module "tempest-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes/workers?ref=v1.10.0"
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes/workers?ref=v1.10.1"
providers = {
aws = "aws.default"
@ -77,7 +77,7 @@ Create a cluster following the Google Cloud [tutorial](../google-cloud.md#cluste
```tf
module "yavin-worker-pool" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.10.0"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.10.1"
providers = {
google = "google.default"
@ -111,11 +111,11 @@ Verify a managed instance group of workers joins the cluster within a few minute
```
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.10.0
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.0
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.0
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.10.0
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.10.0
yavin-controller-0.c.example-com.internal Ready 6m v1.10.1
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.1
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.1
yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.10.1
yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.10.1
```
### Variables

View File

@ -1,6 +1,6 @@
# AWS
In this tutorial, we'll create a Kubernetes v1.10.0 cluster on AWS.
In this tutorial, we'll create a Kubernetes v1.10.1 cluster on AWS.
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, a VPC, gateway, subnets, auto-scaling groups of controllers and workers, network load balancers for controllers and workers, and security groups will be created.
@ -96,7 +96,7 @@ Define a Kubernetes cluster using the module `aws/container-linux/kubernetes`.
```tf
module "aws-tempest" {
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.10.0"
source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.10.1"
providers = {
aws = "aws.default"
@ -149,7 +149,7 @@ Get or update Terraform modules.
$ terraform get # downloads missing modules
$ terraform get --update # updates all modules
Get: git::https://github.com/poseidon/typhoon (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.12.0 (update)
```
Plan the resources to be created.
@ -181,9 +181,9 @@ In 4-8 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/tempest/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
ip-10-0-12-221 Ready 34m v1.10.0
ip-10-0-19-112 Ready 34m v1.10.0
ip-10-0-4-22 Ready 34m v1.10.0
ip-10-0-12-221 Ready 34m v1.10.1
ip-10-0-19-112 Ready 34m v1.10.1
ip-10-0-4-22 Ready 34m v1.10.1
```
List the pods.

View File

@ -1,6 +1,6 @@
# Bare-Metal
In this tutorial, we'll network boot and provision a Kubernetes v1.10.0 cluster on bare-metal.
In this tutorial, we'll network boot and provision a Kubernetes v1.10.1 cluster on bare-metal.
First, we'll deploy a [Matchbox](https://github.com/coreos/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Container Linux to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers.
@ -22,10 +22,7 @@ Collect a MAC address from each machine. For machines with multiple PXE-enabled
* 52:54:00:b2:2f:86 (node2)
* 52:54:00:c3:61:77 (node3)
Configure each machine to boot from the disk [^1] through IPMI or the BIOS menu.
[^1]: Configuring "diskless" workers that always PXE boot is possible, but not in the scope of this tutorial.
Configure each machine to boot from the disk through IPMI or the BIOS menu.
```
ipmitool -H node1 -U USER -P PASS chassis bootdev disk options=persistent
@ -177,7 +174,7 @@ Define a Kubernetes cluster using the module `bare-metal/container-linux/kuberne
```tf
module "bare-metal-mercury" {
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.10.0"
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.10.1"
providers = {
local = "local.default"
@ -244,7 +241,7 @@ Get or update Terraform modules.
$ terraform get # downloads missing modules
$ terraform get --update # updates all modules
Get: git::https://github.com/poseidon/typhoon (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.12.0 (update)
```
Plan the resources to be created.
@ -295,10 +292,19 @@ module.bare-metal-mercury.null_resource.bootkube-start: Creation complete (ID: 5
Apply complete! Resources: 55 added, 0 changed, 0 destroyed.
```
To watch the install to disk (until machines reboot from disk), SSH to port 2222.
```
# before v1.10.1
$ ssh debug@node1.example.com
# after v1.10.1
$ ssh -p 2222 core@node1.example.com
```
To watch the bootstrap process in detail, SSH to the first controller and journal the logs.
```
$ ssh node1.example.com
$ ssh core@node1.example.com
$ journalctl -f -u bootkube
bootkube[5]: Pod Status: pod-checkpointer Running
bootkube[5]: Pod Status: kube-apiserver Running
@ -316,9 +322,9 @@ bootkube[5]: Tearing down temporary bootstrap control plane...
$ export KUBECONFIG=/home/user/.secrets/clusters/mercury/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
node1.example.com Ready 11m v1.10.0
node2.example.com Ready 11m v1.10.0
node3.example.com Ready 11m v1.10.0
node1.example.com Ready 11m v1.10.1
node2.example.com Ready 11m v1.10.1
node3.example.com Ready 11m v1.10.1
```
List the pods.

View File

@ -1,6 +1,6 @@
# Digital Ocean
In this tutorial, we'll create a Kubernetes v1.10.0 cluster on Digital Ocean.
In this tutorial, we'll create a Kubernetes v1.10.1 cluster on Digital Ocean.
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, firewall rules, DNS records, tags, and droplets for Kubernetes controllers and workers will be created.
@ -90,7 +90,7 @@ Define a Kubernetes cluster using the module `digital-ocean/container-linux/kube
```tf
module "digital-ocean-nemo" {
source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes?ref=v1.10.0"
source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes?ref=v1.10.1"
providers = {
digitalocean = "digitalocean.default"
@ -143,7 +143,7 @@ Get or update Terraform modules.
$ terraform get # downloads missing modules
$ terraform get --update # updates all modules
Get: git::https://github.com/poseidon/typhoon (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.12.0 (update)
```
Plan the resources to be created.
@ -176,9 +176,9 @@ In 3-6 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/nemo/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
10.132.110.130 Ready 10m v1.10.0
10.132.115.81 Ready 10m v1.10.0
10.132.124.107 Ready 10m v1.10.0
10.132.110.130 Ready 10m v1.10.1
10.132.115.81 Ready 10m v1.10.1
10.132.124.107 Ready 10m v1.10.1
```
List the pods.

View File

@ -1,6 +1,6 @@
# Google Cloud
In this tutorial, we'll create a Kubernetes v1.10.0 cluster on Google Compute Engine (not GKE).
In this tutorial, we'll create a Kubernetes v1.10.1 cluster on Google Compute Engine (not GKE).
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, a network, firewall rules, managed instance groups of Kubernetes controllers and workers, network load balancers for controllers and workers, and health checks will be created.
@ -97,7 +97,7 @@ Define a Kubernetes cluster using the module `google-cloud/container-linux/kuber
```tf
module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.10.0"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.10.1"
providers = {
google = "google.default"
@ -150,7 +150,7 @@ Get or update Terraform modules.
$ terraform get # downloads missing modules
$ terraform get --update # updates all modules
Get: git::https://github.com/poseidon/typhoon (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.11.0 (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.12.0 (update)
```
Plan the resources to be created.
@ -184,9 +184,9 @@ In 4-8 minutes, the Kubernetes cluster will be ready.
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.10.0
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.0
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.0
yavin-controller-0.c.example-com.internal Ready 6m v1.10.1
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.1
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.1
```
List the pods.

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.10.0 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/) and [preemption](https://typhoon.psdn.io/google-cloud/#preemption) (varies by platform)
@ -44,7 +44,7 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
```tf
module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.10.0"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.10.1"
providers = {
google = "google.default"
@ -85,9 +85,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.10.0
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.0
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.0
yavin-controller-0.c.example-com.internal Ready 6m v1.10.1
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.1
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.1
```
List the pods.

View File

@ -162,14 +162,14 @@ show ip bgp neighbors
show ip route bgp
```
Be sure to register the peer by creating a Calico `bgpPeer` CRD with `kubectl apply`.
Be sure to register the peer by creating a Calico `BGPPeer` CRD with `kubectl apply`.
```
apiVersion: v1
kind: bgpPeer
apiVersion: crd.projectcalico.org/v1
kind: BGPPeer
metadata:
peerIP: LAN_IP
scope: global
name: NAME
spec:
peerIP: LAN_IP
asNumber: 64512
```

View File

@ -18,7 +18,7 @@ module "google-cloud-yavin" {
}
module "bare-metal-mercury" {
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.10.0"
source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.10.1"
...
}
```

View File

@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.10.0 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Kubernetes v1.10.1 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)

View File

@ -17,7 +17,7 @@ resource "google_dns_record_set" "controllers" {
rrdatas = ["${google_compute_address.controllers-ip.address}"]
}
# Network Load Balancer (i.e. forwarding rule)
# Network Load Balancer for controllers
resource "google_compute_forwarding_rule" "controller-https-rule" {
name = "${var.cluster_name}-controller-https-rule"
ip_address = "${google_compute_address.controllers-ip.address}"

View File

@ -1,10 +1,10 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=5f3546b66ffb9946b36e612537bb6a1830ae7746"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=db36b92abced3c4b0af279adfd5ed4bf0cf8c39f"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
etcd_servers = "${module.controllers.etcd_fqdns}"
etcd_servers = ["${null_resource.repeat.*.triggers.domain}"]
asset_dir = "${var.asset_dir}"
networking = "${var.networking}"
network_mtu = 1440

View File

@ -7,12 +7,13 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.2"
Environment="ETCD_IMAGE_TAG=v3.3.3"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
Environment="ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379"
Environment="ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380"
Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
Environment="ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}"
Environment="ETCD_STRICT_RECONFIG_CHECK=true"
Environment="ETCD_SSL_DIR=/etc/ssl/etcd"
@ -117,8 +118,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.10.0
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -139,7 +140,7 @@ storage:
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.11.0}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.12.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \

View File

@ -1,44 +0,0 @@
module "controllers" {
source = "controllers"
cluster_name = "${var.cluster_name}"
# GCE
region = "${var.region}"
network = "${google_compute_network.network.name}"
dns_zone = "${var.dns_zone}"
dns_zone_name = "${var.dns_zone_name}"
count = "${var.controller_count}"
machine_type = "${var.controller_type}"
os_image = "${var.os_image}"
disk_size = "${var.disk_size}"
# configuration
networking = "${var.networking}"
kubeconfig = "${module.bootkube.kubeconfig}"
ssh_authorized_key = "${var.ssh_authorized_key}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
clc_snippets = "${var.controller_clc_snippets}"
}
module "workers" {
source = "workers"
name = "${var.cluster_name}"
cluster_name = "${var.cluster_name}"
# GCE
region = "${var.region}"
network = "${google_compute_network.network.name}"
count = "${var.worker_count}"
machine_type = "${var.worker_type}"
os_image = "${var.os_image}"
disk_size = "${var.disk_size}"
preemptible = "${var.worker_preemptible}"
# configuration
kubeconfig = "${module.bootkube.kubeconfig}"
ssh_authorized_key = "${var.ssh_authorized_key}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
clc_snippets = "${var.worker_clc_snippets}"
}

View File

@ -1,6 +1,6 @@
# Discrete DNS records for each controller's private IPv4 for etcd usage
resource "google_dns_record_set" "etcds" {
count = "${var.count}"
count = "${var.controller_count}"
# DNS Zone name where record should be created
managed_zone = "${var.dns_zone_name}"
@ -21,11 +21,11 @@ data "google_compute_zones" "all" {
# Controller instances
resource "google_compute_instance" "controllers" {
count = "${var.count}"
count = "${var.controller_count}"
name = "${var.cluster_name}-controller-${count.index}"
zone = "${element(data.google_compute_zones.all.names, count.index)}"
machine_type = "${var.machine_type}"
machine_type = "${var.controller_type}"
metadata {
user-data = "${element(data.ct_config.controller_ign.*.rendered, count.index)}"
@ -41,7 +41,7 @@ resource "google_compute_instance" "controllers" {
}
network_interface {
network = "${var.network}"
network = "${google_compute_network.network.name}"
# Ephemeral external IP
access_config = {}
@ -51,9 +51,13 @@ resource "google_compute_instance" "controllers" {
tags = ["${var.cluster_name}-controller"]
}
locals {
controllers_ipv4_public = ["${google_compute_instance.controllers.*.network_interface.0.access_config.0.assigned_nat_ip}"]
}
# Controller Container Linux Config
data "template_file" "controller_config" {
count = "${var.count}"
count = "${var.controller_count}"
template = "${file("${path.module}/cl/controller.yaml.tmpl")}"
@ -65,7 +69,7 @@ data "template_file" "controller_config" {
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
etcd_initial_cluster = "${join(",", formatlist("%s=https://%s:2380", null_resource.repeat.*.triggers.name, null_resource.repeat.*.triggers.domain))}"
kubeconfig = "${indent(10, var.kubeconfig)}"
kubeconfig = "${indent(10, module.bootkube.kubeconfig)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
@ -75,7 +79,7 @@ data "template_file" "controller_config" {
# Horrible hack to generate a Terraform list of a desired length without dependencies.
# Ideal ${repeat("etcd", 3) -> ["etcd", "etcd", "etcd"]}
resource null_resource "repeat" {
count = "${var.count}"
count = "${var.controller_count}"
triggers {
name = "etcd${count.index}"
@ -84,8 +88,8 @@ resource null_resource "repeat" {
}
data "ct_config" "controller_ign" {
count = "${var.count}"
count = "${var.controller_count}"
content = "${element(data.template_file.controller_config.*.rendered, count.index)}"
pretty_print = false
snippets = ["${var.clc_snippets}"]
snippets = ["${var.controller_clc_snippets}"]
}

View File

@ -1,7 +0,0 @@
output "etcd_fqdns" {
value = ["${null_resource.repeat.*.triggers.domain}"]
}
output "ipv4_public" {
value = ["${google_compute_instance.controllers.*.network_interface.0.access_config.0.assigned_nat_ip}"]
}

View File

@ -1,87 +0,0 @@
variable "cluster_name" {
type = "string"
description = "Unique cluster name"
}
variable "region" {
type = "string"
description = "Google Cloud region (e.g. us-central1, see `gcloud compute regions list`)."
}
variable "network" {
type = "string"
description = "Name of the network to attach to the compute instance interfaces"
}
variable "dns_zone" {
type = "string"
description = "Google Cloud DNS Zone value to create etcd/k8s subdomains (e.g. dghubble.io)"
}
variable "dns_zone_name" {
type = "string"
description = "Google Cloud DNS Zone name to create etcd/k8s subdomains (e.g. dghubble-io)"
}
# instances
variable "count" {
type = "string"
description = "Number of controller compute instances the instance group should manage"
}
variable "machine_type" {
type = "string"
description = "Machine type for compute instances (e.g. gcloud compute machine-types list)"
}
variable "os_image" {
type = "string"
description = "OS image from which to initialize the disk (e.g. gcloud compute images list)"
}
variable "disk_size" {
type = "string"
default = "40"
description = "Size of the disk in GB"
}
# configuration
variable "networking" {
description = "Choice of networking provider (flannel or calico)"
type = "string"
default = "calico"
}
variable "kubeconfig" {
type = "string"
description = "Generated Kubelet kubeconfig"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for logging in as user 'core'"
}
variable "service_cidr" {
description = <<EOD
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
EOD
type = "string"
default = "10.3.0.0/16"
}
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by kube-dns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = "string"
default = "cluster.local"
}
variable "clc_snippets" {
type = "list"
description = "Container Linux Config snippets"
default = []
}

View File

@ -56,6 +56,20 @@ resource "google_compute_firewall" "internal-etcd" {
target_tags = ["${var.cluster_name}-controller"]
}
# Allow Prometheus to scrape etcd metrics
resource "google_compute_firewall" "internal-etcd-metrics" {
name = "${var.cluster_name}-internal-etcd-metrics"
network = "${google_compute_network.network.name}"
allow {
protocol = "tcp"
ports = [2381]
}
source_tags = ["${var.cluster_name}-worker"]
target_tags = ["${var.cluster_name}-controller"]
}
# Calico BGP and IPIP
# https://docs.projectcalico.org/v2.5/reference/public-cloud/gce
resource "google_compute_firewall" "internal-calico" {
@ -93,7 +107,7 @@ resource "google_compute_firewall" "internal-flannel" {
target_tags = ["${var.cluster_name}-controller", "${var.cluster_name}-worker"]
}
# Allow prometheus (workload) to scrape node-exporter daemonset
# Allow Prometheus to scrape node-exporter daemonset
resource "google_compute_firewall" "internal-node-exporter" {
name = "${var.cluster_name}-internal-node-exporter"
network = "${google_compute_network.network.name}"

View File

@ -1,19 +1,22 @@
# Deprecated
output "controllers_ipv4_public" {
value = ["${module.controllers.ipv4_public}"]
value = ["${google_compute_instance.controllers.*.network_interface.0.access_config.0.assigned_nat_ip}"]
}
output "ingress_static_ip" {
value = "${module.workers.ingress_static_ip}"
}
output "network_name" {
value = "${google_compute_network.network.name}"
}
output "network_self_link" {
value = "${google_compute_network.network.self_link}"
}
# Outputs for worker pools
output "network_name" {
value = "${google_compute_network.network.name}"
}
output "kubeconfig" {
value = "${module.bootkube.kubeconfig}"
}

View File

@ -4,7 +4,7 @@ resource "null_resource" "copy-controller-secrets" {
connection {
type = "ssh"
host = "${element(module.controllers.ipv4_public, count.index)}"
host = "${element(local.controllers_ipv4_public, count.index)}"
user = "core"
timeout = "15m"
}
@ -65,14 +65,14 @@ resource "null_resource" "copy-controller-secrets" {
resource "null_resource" "bootkube-start" {
depends_on = [
"module.bootkube",
"module.controllers",
"module.workers",
"google_dns_record_set.controllers",
"null_resource.copy-controller-secrets",
]
connection {
type = "ssh"
host = "${element(module.controllers.ipv4_public, 0)}"
host = "${element(local.controllers_ipv4_public, 0)}"
user = "core"
timeout = "15m"
}

View File

@ -34,13 +34,13 @@ variable "worker_count" {
description = "Number of workers"
}
variable controller_type {
variable "controller_type" {
type = "string"
default = "n1-standard-1"
description = "Machine type for controllers (see `gcloud compute machine-types list`)"
}
variable worker_type {
variable "worker_type" {
type = "string"
default = "n1-standard-1"
description = "Machine type for controllers (see `gcloud compute machine-types list`)"

View File

@ -0,0 +1,21 @@
module "workers" {
source = "workers"
name = "${var.cluster_name}"
cluster_name = "${var.cluster_name}"
# GCE
region = "${var.region}"
network = "${google_compute_network.network.name}"
count = "${var.worker_count}"
machine_type = "${var.worker_type}"
os_image = "${var.os_image}"
disk_size = "${var.disk_size}"
preemptible = "${var.worker_preemptible}"
# configuration
kubeconfig = "${module.bootkube.kubeconfig}"
ssh_authorized_key = "${var.ssh_authorized_key}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
clc_snippets = "${var.worker_clc_snippets}"
}

View File

@ -40,8 +40,6 @@ systemd:
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
@ -90,8 +88,8 @@ storage:
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://gcr.io/google_containers/hyperkube
KUBELET_IMAGE_TAG=v1.10.0
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.1
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -109,7 +107,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://gcr.io/google_containers/hyperkube:v1.10.0 \
docker://k8s.gcr.io/hyperkube:v1.10.1 \
--net=host \
--dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)

View File

@ -1,5 +1,5 @@
mkdocs==0.17.2
mkdocs-material==2.2.6
mkdocs==0.17.3
mkdocs-material==2.7.1
pygments==2.2.0
pymdown-extensions==3.5
six==1.10.0