Migrate Fedora CoreOS bare-metal to static pod control plane

* Run a kube-apiserver, kube-scheduler, and kube-controller-manager
static pod on each controller node. Previously, kube-apiserver was
self-hosted as a DaemonSet across controllers, and kube-scheduler
and kube-controller-manager each ran as a Deployment (with 2 or
controller_count replicas).
* Remove bootkube bootstrap and pivot to self-hosted
* Remove pod-checkpointer manifests (no longer needed)
Dalton Hubble 2019-09-03 22:00:34 -07:00
parent b60a2ecdf7
commit 74780fb09f
8 changed files with 65 additions and 52 deletions


@ -110,7 +110,7 @@ systemd:
- name: bootstrap.service - name: bootstrap.service
contents: | contents: |
[Unit] [Unit]
Description=Bootstrap Kubernetes control plane Description=Kubernetes control plane
ConditionPathExists=!/opt/bootstrap/bootstrap.done ConditionPathExists=!/opt/bootstrap/bootstrap.done
[Service] [Service]
Type=oneshot Type=oneshot


@ -11,7 +11,7 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a> ## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.15.3 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube)) * Kubernetes v1.15.3 (upstream)
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking * Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) * On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization * Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization


@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests) # Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" { module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=98cc19f80f2c4c3ddc63fc7aea6320e74bec561a" source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=6e59af71138bc5f784453873074de16e7ee150eb"
cluster_name = var.cluster_name cluster_name = var.cluster_name
api_servers = [var.k8s_domain_name] api_servers = [var.k8s_domain_name]


@ -118,33 +118,48 @@ systemd:
PathExists=/etc/kubernetes/kubeconfig PathExists=/etc/kubernetes/kubeconfig
[Install] [Install]
WantedBy=multi-user.target WantedBy=multi-user.target
- name: bootkube.service - name: bootstrap.service
contents: | contents: |
[Unit] [Unit]
Description=Bootstrap a Kubernetes control plane Description=Kubernetes control plane
ConditionPathExists=!/opt/bootkube/init_bootkube.done ConditionPathExists=!/opt/bootstrap/bootstrap.done
[Service] [Service]
Type=oneshot Type=oneshot
RemainAfterExit=true RemainAfterExit=true
WorkingDirectory=/opt/bootkube WorkingDirectory=/opt/bootstrap
ExecStart=/usr/bin/bash -c 'set -x && \ ExecStartPre=-/usr/bin/bash -c 'set -x && [ -n "$(ls /opt/bootstrap/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootstrap/assets/manifests-*/* /opt/bootstrap/assets/manifests && rm -rf /opt/bootstrap/assets/manifests-*'
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-* && exec podman run --name bootkube --privileged \ ExecStart=/usr/bin/podman run --name bootstrap \
--network host \ --network host \
--volume /opt/bootkube/assets:/assets \ --volume /opt/bootstrap/assets:/assets:ro,Z \
--volume /etc/kubernetes:/etc/kubernetes \ --volume /opt/bootstrap/apply:/apply:ro,Z \
quay.io/coreos/bootkube:v0.14.0 \ k8s.gcr.io/hyperkube:v1.15.3 \
/bootkube start --asset-dir=/assets' /apply
ExecStartPost=/bin/touch /opt/bootkube/init_bootkube.done ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done
ExecStartPost=-/usr/bin/podman stop bootstrap
storage: storage:
directories: directories:
- path: /etc/kubernetes - path: /etc/kubernetes
- path: /opt/bootkube - path: /opt/bootstrap
files: files:
- path: /etc/hostname - path: /etc/hostname
mode: 0644 mode: 0644
contents: contents:
inline: inline:
${domain_name} ${domain_name}
- path: /opt/bootstrap/apply
mode: 0544
contents:
inline: |
#!/bin/bash -e
export KUBECONFIG=/assets/auth/kubeconfig
until kubectl version; do
echo "Waiting for static pod control plane"
sleep 5
done
until kubectl apply -f /assets/manifests -R; do
echo "Retry applying manifests"
sleep 5
done
- path: /etc/sysctl.d/reverse-path-filter.conf - path: /etc/sysctl.d/reverse-path-filter.conf
contents: contents:
inline: | inline: |
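
The `/opt/bootstrap/apply` script added in this hunk waits for the static pod control plane with two `until` loops that retry every 5 seconds without an upper bound. The sketch below shows the same retry pattern with an attempt limit. The limit and the `retry` helper are assumptions for illustration and are not part of this commit.

```sh
#!/bin/bash
# Hypothetical helper (not in the commit): run a command until it succeeds,
# giving up after a fixed number of attempts.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "Retry ${attempt}/${max_attempts}: $*" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

For example, `retry 60 5 kubectl version` would mirror the first loop of the script while stopping after roughly five minutes of waiting. The unbounded loops in the commit are simpler; the bounded form only matters if a failed unit is preferable to one that never finishes.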


@ -89,7 +89,6 @@ systemd:
storage: storage:
directories: directories:
- path: /etc/kubernetes - path: /etc/kubernetes
- path: /opt/bootkube
files: files:
- path: /etc/hostname - path: /etc/hostname
mode: 0644 mode: 0644


@ -1,4 +1,4 @@
# Secure copy etcd TLS assets and kubeconfig to controllers. Activates kubelet.service # Secure copy assets to controllers. Activates kubelet.service
resource "null_resource" "copy-controller-secrets" { resource "null_resource" "copy-controller-secrets" {
count = length(var.controller_names) count = length(var.controller_names)
@ -7,6 +7,7 @@ resource "null_resource" "copy-controller-secrets" {
depends_on = [ depends_on = [
matchbox_group.controller, matchbox_group.controller,
matchbox_group.worker, matchbox_group.worker,
module.bootkube,
] ]
connection { connection {
@ -56,6 +57,11 @@ resource "null_resource" "copy-controller-secrets" {
destination = "$HOME/etcd-peer.key" destination = "$HOME/etcd-peer.key"
} }
provisioner "file" {
source = var.asset_dir
destination = "$HOME/assets"
}
provisioner "remote-exec" { provisioner "remote-exec" {
inline = [ inline = [
"sudo mkdir -p /etc/ssl/etcd/etcd", "sudo mkdir -p /etc/ssl/etcd/etcd",
@ -67,6 +73,11 @@ resource "null_resource" "copy-controller-secrets" {
"sudo mv etcd-peer.crt /etc/ssl/etcd/etcd/peer.crt", "sudo mv etcd-peer.crt /etc/ssl/etcd/etcd/peer.crt",
"sudo mv etcd-peer.key /etc/ssl/etcd/etcd/peer.key", "sudo mv etcd-peer.key /etc/ssl/etcd/etcd/peer.key",
"sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig", "sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
"sudo mv $HOME/assets /opt/bootstrap/assets",
"sudo mkdir -p /etc/kubernetes/bootstrap-secrets",
"sudo cp -r /opt/bootstrap/assets/tls/* /etc/kubernetes/bootstrap-secrets/",
"sudo cp /opt/bootstrap/assets/auth/kubeconfig /etc/kubernetes/bootstrap-secrets/",
"sudo cp -r /opt/bootstrap/assets/static-manifests/* /etc/kubernetes/manifests/"
] ]
} }
} }
@ -101,9 +112,8 @@ resource "null_resource" "copy-worker-secrets" {
} }
} }
# Secure copy bootkube assets to ONE controller and start bootkube to perform # Connect to a controller to perform one-time cluster bootstrap.
# one-time self-hosted cluster bootstrapping. resource "null_resource" "bootstrap" {
resource "null_resource" "bootkube-start" {
# Without depends_on, this remote-exec may start before the kubeconfig copy. # Without depends_on, this remote-exec may start before the kubeconfig copy.
# Terraform only does one task at a time, so it would try to bootstrap # Terraform only does one task at a time, so it would try to bootstrap
# while no Kubelets are running. # while no Kubelets are running.
@ -119,15 +129,9 @@ resource "null_resource" "bootkube-start" {
timeout = "15m" timeout = "15m"
} }
provisioner "file" {
source = var.asset_dir
destination = "$HOME/assets"
}
provisioner "remote-exec" { provisioner "remote-exec" {
inline = [ inline = [
"sudo mv $HOME/assets /opt/bootkube", "sudo systemctl start bootstrap",
"sudo systemctl start bootkube",
] ]
} }
} }
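
After this change, the `copy-controller-secrets` steps place assets on every controller under `/opt/bootstrap/assets`, `/etc/kubernetes/bootstrap-secrets`, and `/etc/kubernetes/manifests`. A minimal sketch of a check for that layout follows. The function name and the optional root prefix (handy for trying it against a scratch directory) are assumptions, not something the commit provides.

```sh
#!/bin/bash
# Hypothetical check (not in the commit): report expected bootstrap paths
# that are missing. An optional first argument is prepended to each path.
check_bootstrap_layout() {
  local root="${1:-}"
  local missing=0
  local path
  for path in \
    /opt/bootstrap/assets \
    /etc/kubernetes/bootstrap-secrets/kubeconfig \
    /etc/kubernetes/manifests; do
    if [ ! -e "${root}${path}" ]; then
      echo "missing: ${path}"
      missing=1
    fi
  done
  # Non-zero exit status when at least one path is absent.
  return "$missing"
}
```

Run on a controller as `check_bootstrap_layout` (with `sudo` if the directories are not readable by the login user) before starting `bootstrap.service`; any `missing:` line suggests the copy step did not complete.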


@ -7,7 +7,7 @@ In this tutorial, we'll create a Kubernetes v1.15.3 cluster on AWS with Fedora C
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets.
Controllers hosts are provisioned to run an `etcd-member` peer and a `kubelet` service. Worker hosts run a `kubelet` service. Controller nodes run `kube-apiserver`, `kube-scheduler`, `kube-controller-manager`, and `coredns`, while `kube-proxy` and `calico` (or `flannel`) run on every node. A generated `kubeconfig` provides `kubectl` access to the cluster. Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` service. Worker hosts run a `kubelet` service. Controller nodes run `kube-apiserver`, `kube-scheduler`, `kube-controller-manager`, and `coredns`, while `kube-proxy` and `calico` (or `flannel`) run on every node. A generated `kubeconfig` provides `kubectl` access to the cluster.
## Requirements ## Requirements
@ -153,7 +153,7 @@ kube-system calico-node-7jmr1 2/2 Running 0
kube-system calico-node-bknc8 2/2 Running 0 34m kube-system calico-node-bknc8 2/2 Running 0 34m
kube-system coredns-1187388186-wx1lg 1/1 Running 0 34m kube-system coredns-1187388186-wx1lg 1/1 Running 0 34m
kube-system coredns-1187388186-qjnvp 1/1 Running 0 34m kube-system coredns-1187388186-qjnvp 1/1 Running 0 34m
kube-system kube-apiserver-4mjbk 1/1 Running 0 34m kube-system kube-apiserver-ip-10-0-3-155 1/1 Running 0 34m
kube-system kube-controller-manager-ip-10-0-3-155 1/1 Running 0 34m kube-system kube-controller-manager-ip-10-0-3-155 1/1 Running 0 34m
kube-system kube-proxy-14wxv 1/1 Running 0 34m kube-system kube-proxy-14wxv 1/1 Running 0 34m
kube-system kube-proxy-9vxh2 1/1 Running 0 34m kube-system kube-proxy-9vxh2 1/1 Running 0 34m


@ -7,7 +7,7 @@ In this tutorial, we'll network boot and provision a Kubernetes v1.15.3 cluster
First, we'll deploy a [Matchbox](https://github.com/poseidon/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Fedora CoreOS to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers via Ignition. First, we'll deploy a [Matchbox](https://github.com/poseidon/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Fedora CoreOS to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers via Ignition.
Controllers are provisioned to run an `etcd-member` peer and a `kubelet` service. Workers run just a `kubelet` service. A one-time [bootkube](https://github.com/kubernetes-incubator/bootkube) bootstrap schedules the `apiserver`, `scheduler`, `controller-manager`, and `coredns` on controllers and schedules `kube-proxy` and `calico` (or `flannel`) on every node. A generated `kubeconfig` provides `kubectl` access to the cluster. Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` service. Worker hosts run a `kubelet` service. Controller nodes run `kube-apiserver`, `kube-scheduler`, `kube-controller-manager`, and `coredns`, while `kube-proxy` and `calico` (or `flannel`) run on every node. A generated `kubeconfig` provides `kubectl` access to the cluster.
## Requirements ## Requirements
@ -200,7 +200,7 @@ Reference the [variables docs](#variables) or the [variables.tf](https://github.
## ssh-agent ## ssh-agent
Initial bootstrapping requires `bootkube.service` be started on one controller node. Terraform uses `ssh-agent` to automate this step. Add your SSH private key to `ssh-agent`. Initial bootstrapping requires `bootstrap.service` be started on one controller node. Terraform uses `ssh-agent` to automate this step. Add your SSH private key to `ssh-agent`.
```sh ```sh
ssh-add ~/.ssh/id_rsa ssh-add ~/.ssh/id_rsa
@ -222,7 +222,7 @@ $ terraform plan
Plan: 55 to add, 0 to change, 0 to destroy. Plan: 55 to add, 0 to change, 0 to destroy.
``` ```
Apply the changes. Terraform will generate bootkube assets to `asset_dir` and create Matchbox profiles (e.g. controller, worker) and matching rules via the Matchbox API. Apply the changes. Terraform will generate bootstrap assets to `asset_dir` and create Matchbox profiles (e.g. controller, worker) and matching rules via the Matchbox API.
```sh ```sh
$ terraform apply $ terraform apply
@ -251,14 +251,14 @@ Machines will network boot, install Fedora CoreOS to disk, reboot into the disk
### Bootstrap ### Bootstrap
Wait for the `bootkube-start` step to finish bootstrapping the Kubernetes control plane. This may take 5-15 minutes depending on your network. Wait for the `bootstrap` step to finish bootstrapping the Kubernetes control plane. This may take 5-15 minutes depending on your network.
``` ```
module.bare-metal-mercury.null_resource.bootkube-start: Still creating... (6m10s elapsed) module.bare-metal-mercury.null_resource.bootstrap: Still creating... (6m10s elapsed)
module.bare-metal-mercury.null_resource.bootkube-start: Still creating... (6m20s elapsed) module.bare-metal-mercury.null_resource.bootstrap: Still creating... (6m20s elapsed)
module.bare-metal-mercury.null_resource.bootkube-start: Still creating... (6m30s elapsed) module.bare-metal-mercury.null_resource.bootstrap: Still creating... (6m30s elapsed)
module.bare-metal-mercury.null_resource.bootkube-start: Still creating... (6m40s elapsed) module.bare-metal-mercury.null_resource.bootstrap: Still creating... (6m40s elapsed)
module.bare-metal-mercury.null_resource.bootkube-start: Creation complete (ID: 5441741360626669024) module.bare-metal-mercury.null_resource.bootstrap: Creation complete (ID: 5441741360626669024)
Apply complete! Resources: 55 added, 0 changed, 0 destroyed. Apply complete! Resources: 55 added, 0 changed, 0 destroyed.
``` ```
@ -267,13 +267,12 @@ To watch the bootstrap process in detail, SSH to the first controller and journa
``` ```
$ ssh core@node1.example.com $ ssh core@node1.example.com
$ journalctl -f -u bootkube $ journalctl -f -u bootstrap
bootkube[5]: Pod Status: pod-checkpointer Running podman[1750]: The connection to the server cluster.example.com:6443 was refused - did you specify the right host or port?
bootkube[5]: Pod Status: kube-apiserver Running podman[1750]: Waiting for static pod control plane
bootkube[5]: Pod Status: kube-scheduler Running ...
bootkube[5]: Pod Status: kube-controller-manager Running podman[1750]: serviceaccount/calico-node unchanged
bootkube[5]: All self-hosted control plane components successfully started systemd[1]: Started Kubernetes control plane.
bootkube[5]: Tearing down temporary bootstrap control plane...
``` ```
## Verify ## Verify
@ -299,16 +298,12 @@ kube-system calico-node-gnjrm 2/2 Running 0
kube-system calico-node-llbgt 2/2 Running 0 11m kube-system calico-node-llbgt 2/2 Running 0 11m
kube-system coredns-1187388186-dj3pd 1/1 Running 0 11m kube-system coredns-1187388186-dj3pd 1/1 Running 0 11m
kube-system coredns-1187388186-mx9rt 1/1 Running 0 11m kube-system coredns-1187388186-mx9rt 1/1 Running 0 11m
kube-system kube-apiserver-7336w 1/1 Running 0 11m kube-system kube-apiserver-node1.example.com 1/1 Running 0 11m
kube-system kube-controller-manager-3271970485-b9chx 1/1 Running 0 11m kube-system kube-controller-manager-node1.example.com 1/1 Running 1 11m
kube-system kube-controller-manager-3271970485-v30js 1/1 Running 1 11m
kube-system kube-proxy-50sd4 1/1 Running 0 11m kube-system kube-proxy-50sd4 1/1 Running 0 11m
kube-system kube-proxy-bczhp 1/1 Running 0 11m kube-system kube-proxy-bczhp 1/1 Running 0 11m
kube-system kube-proxy-mp2fw 1/1 Running 0 11m kube-system kube-proxy-mp2fw 1/1 Running 0 11m
kube-system kube-scheduler-3895335239-fd3l7 1/1 Running 1 11m kube-system kube-scheduler-node1.example.com 1/1 Running 0 11m
kube-system kube-scheduler-3895335239-hfjv0 1/1 Running 0 11m
kube-system pod-checkpointer-wf65d 1/1 Running 0 11m
kube-system pod-checkpointer-wf65d-node1.example.com 1/1 Running 0 11m
``` ```
## Going Further ## Going Further