10 KiB
Google Cloud
In this tutorial, we'll create a Kubernetes v1.8.5 cluster on Google Compute Engine (not GKE).
We'll declare a Kubernetes cluster in Terraform using the Typhoon Terraform module. On apply, a network, firewall rules, managed instance groups of Kubernetes controllers and workers, network load balancers for controllers and workers, and health checks will be created.
Controllers and workers are provisioned to run a kubelet
. A one-time bootkube bootstrap schedules an apiserver
, scheduler
, controller-manager
, and kube-dns
on controllers and runs kube-proxy
and calico
or flannel
on each node. A generated kubeconfig
provides kubectl
access to the cluster.
Requirements
- Google Cloud Account and Service Account
- Google Cloud DNS Zone (registered Domain Name or delegated subdomain)
- Terraform v0.10.x and terraform-provider-ct installed locally
Terraform Setup
Install Terraform v0.10.x on your system.
$ terraform version
Terraform v0.10.7
Add the terraform-provider-ct plugin binary for your system.
wget https://github.com/coreos/terraform-provider-ct/releases/download/v0.2.0/terraform-provider-ct-v0.2.0-linux-amd64.tar.gz
tar xzf terraform-provider-ct-v0.2.0-linux-amd64.tar.gz
sudo mv terraform-provider-ct-v0.2.0-linux-amd64/terraform-provider-ct /usr/local/bin/
Add the plugin to your ~/.terraformrc
.
providers {
ct = "/usr/local/bin/terraform-provider-ct"
}
Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra
).
cd infra/clusters
Provider
Login to your Google Console API Manager and select a project, or signup if you don't have an account.
Select "Credentials", and create service account key credentials. Choose the "Compute Engine default service account" and save the JSON private key to a file that can be referenced in configs.
mv ~/Downloads/project-id-43048204.json ~/.config/google-cloud/terraform.json
Configure the Google Cloud provider to use your service account key, project-id, and region in a providers.tf
file.
provider "google" {
credentials = "${file("~/.config/google-cloud/terraform.json")}"
project = "project-id"
region = "us-central1"
}
Additional configuration options are described in the google
provider docs.
!!! tip
A project may contain multiple clusters if you wish. Regions are listed in docs or with gcloud compute regions list
.
Cluster
Define a Kubernetes cluster using the module google-cloud/container-linux/kubernetes
.
module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes"
# Google Cloud
region = "us-central1"
dns_zone = "example.com"
dns_zone_name = "example-zone"
os_image = "coreos-stable-1576-4-0-v20171206"
cluster_name = "yavin"
controller_count = 1
worker_count = 2
ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
# output assets dir
asset_dir = "/home/user/.secrets/clusters/yavin"
}
Reference the variables docs or the variables.tf source.
ssh-agent
Initial bootstrapping requires bootkube.service
be started on one controller node. Terraform uses ssh-agent
to automate this step. Add your SSH private key to ssh-agent
.
ssh-add ~/.ssh/id_rsa
ssh-add -L
!!! warning
terraform apply
will hang connecting to a controller if ssh-agent
does not contain the SSH key.
Apply
Initialize the config directory if this is the first use with Terraform.
terraform init
Get or update Terraform modules.
$ terraform get # downloads missing modules
$ terraform get --update # updates all modules
Get: git::https://github.com/poseidon/typhoon (update)
Get: git::https://github.com/poseidon/bootkube-terraform.git?ref=v0.9.1 (update)
Plan the resources to be created.
$ terraform plan
Plan: 64 to add, 0 to change, 0 to destroy.
Apply the changes to create the cluster.
$ terraform apply
module.google-cloud-yavin.null_resource.bootkube-start: Still creating... (10s elapsed)
...
module.google-cloud-yavin.null_resource.bootkube-start: Still creating... (5m30s elapsed)
module.google-cloud-yavin.null_resource.bootkube-start: Still creating... (5m40s elapsed)
module.google-cloud-yavin.null_resource.bootkube-start: Creation complete (ID: 5768638456220583358)
Apply complete! Resources: 64 added, 0 changed, 0 destroyed.
In 4-8 minutes, the Kubernetes cluster will be ready.
Verify
Install kubectl on your system. Use the generated kubeconfig
credentials to access the Kubernetes cluster and list nodes.
$ KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.8.5
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.8.5
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.8.5
List the pods.
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-node-1cs8z 2/2 Running 0 6m
kube-system calico-node-d1l5b 2/2 Running 0 6m
kube-system calico-node-sp9ps 2/2 Running 0 6m
kube-system kube-apiserver-zppls 1/1 Running 0 6m
kube-system kube-controller-manager-3271970485-gh9kt 1/1 Running 0 6m
kube-system kube-controller-manager-3271970485-h90v8 1/1 Running 1 6m
kube-system kube-dns-1187388186-zj5dl 3/3 Running 0 6m
kube-system kube-proxy-117v6 1/1 Running 0 6m
kube-system kube-proxy-9886n 1/1 Running 0 6m
kube-system kube-proxy-njn47 1/1 Running 0 6m
kube-system kube-scheduler-3895335239-5x87r 1/1 Running 0 6m
kube-system kube-scheduler-3895335239-bzrrt 1/1 Running 1 6m
kube-system pod-checkpointer-l6lrt 1/1 Running 0 6m
Going Further
Learn about version pinning, maintenance, and addons.
!!! note
On Container Linux clusters, install the container-linux-update-operator
addon to coordinate reboots and drains when nodes auto-update. Otherwise, updates may not be applied until the next reboot.
Variables
Required
Name | Description | Example |
---|---|---|
cluster_name | Unique cluster name (prepended to dns_zone) | "yavin" |
region | Google Cloud region | "us-central1" |
dns_zone | Google Cloud DNS zone | "google-cloud.example.com" |
dns_zone_name | Google Cloud DNS zone name | "example-zone" |
ssh_authorized_key | SSH public key for ~/.ssh_authorized_keys | "ssh-rsa AAAAB3NZ..." |
os_image | OS image for compute instances | "coreos-stable-1576-4-0-v20171206" |
asset_dir | Path to a directory where generated assets should be placed (contains secrets) | "/home/user/.secrets/clusters/yavin" |
Check the list of valid regions and list Container Linux images with gcloud compute images list | grep coreos
.
DNS Zone
Clusters create a DNS A record ${cluster_name}.${dns_zone}
to resolve a network load balancer backed by controller instances. This FQDN is used by workers and kubectl
to access the apiserver. In this example, the cluster's apiserver would be accessible at yavin.google-cloud.example.com
.
You'll need a registered domain name or subdomain registered in a Google Cloud DNS zone. You can set this up once and create many clusters with unique names.
resource "google_dns_managed_zone" "zone-for-clusters" {
dns_name = "google-cloud.example.com."
name = "example-zone"
description = "Production DNS zone"
}
!!! tip "" If you have an existing domain name with a zone file elsewhere, just carve out a subdomain that can be managed on Google Cloud (e.g. google-cloud.mydomain.com) and update nameservers.
Optional
Name | Description | Default | Example |
---|---|---|---|
machine_type | Machine type for compute instances | "n1-standard-1" | See below |
controller_count | Number of controllers (i.e. masters) | 1 | 1 |
worker_count | Number of workers | 1 | 3 |
worker_preemptible | If enabled, Compute Engine will terminate controllers randomly within 24 hours | false | true |
networking | Choice of networking provider | "calico" | "calico" or "flannel" |
pod_cidr | CIDR range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
service_cidr | CIDR range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
cluster_domain_suffix | FQDN suffix for Kubernetes services answered by kube-dns. | "cluster.local" | "k8s.example.com" |
Check the list of valid machine types.
!!! warning Set controller_count to 1. A bug in Google Cloud network load balancer health checking prevents multiple controllers from bootstrapping. There are workarounds, but they all involve tradeoffs we're uncomfortable recommending. See #54.
Preemption
Add worker_preemeptible = "true"
to allow worker nodes to be preempted at random, but pay significantly less. Clusters tolerate stopping instances fairly well (reschedules pods, but cannot drain) and preemption provides a nice reward for running fault-tolerant cluster systems.`