# Google Cloud
In this tutorial, we'll create a Kubernetes v1.31.2 cluster on Google Compute Engine with Flatcar Linux.
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a network, firewall rules, health checks, controller instances, worker managed instance group, load balancers, and TLS assets.
Typhoon provisions clusters with kube-system components like CoreDNS, kube-proxy, and a chosen CNI provider (flannel, Calico, or Cilium) pre-installed, so clusters come with "batteries included". The tradeoff is that upgrading these components is generally done in lock-step, by upgrading to a new Typhoon / Kubernetes release. It can be valuable to manage these components with a separate plan/apply process or through automations and deploy systems, for example to manage CoreDNS separately from the cluster's lifecycle.

These components continue to be pre-installed by default, but a `components` variable allows them to be disabled and managed as "addons", components you apply after cluster creation and manage on a rolling basis. For some of these, Terraform modules may be provided to aid in managing them.

```tf
module "cluster" {
  # defaults
  components = {
    enable = true
    coredns = {
      enable = true
    }
    kube_proxy = {
      enable = true
    }
    # Only the CNI set in var.networking will be installed
    flannel = {
      enable = true
    }
    calico = {
      enable = true
    }
    cilium = {
      enable = true
    }
  }
}
```

An earlier `install_container_networking = true/false` variable has been removed, since the same result can be achieved with this more extensible and general components mechanism by setting the chosen networking provider's `enable` field to false.
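For example, to let a separate apply process manage CoreDNS and kube-proxy while keeping the chosen CNI pre-installed, a possible override of the defaults above might look like the sketch below (whether unset components fall back to their defaults depends on how the module declares optional attributes):

```tf
module "cluster" {
  # manage CoreDNS and kube-proxy out-of-band as addons
  components = {
    enable = true
    coredns = {
      enable = false
    }
    kube_proxy = {
      enable = false
    }
    cilium = {
      enable = true
    }
  }
}
```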
Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` service. Worker hosts run a `kubelet` service. Controller nodes run `kube-apiserver`, `kube-scheduler`, `kube-controller-manager`, and `coredns`, while `kube-proxy` and (`flannel`, `calico`, or `cilium`) run on every node. A generated `kubeconfig` provides `kubectl` access to the cluster.
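Once the cluster is created (see the Apply section below), these services can be spot-checked over SSH. A minimal sketch, assuming the systemd units are named after the services above and your key is authorized for user `core` (replace `CONTROLLER_IP` with a controller's external IP):

```sh
# check the etcd peer and kubelet on a controller node
ssh core@CONTROLLER_IP 'systemctl status etcd-member kubelet --no-pager'
```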
## Requirements
* Google Cloud Account and Service Account
* Google Cloud DNS Zone (registered Domain Name or delegated subdomain)
* Terraform v0.13.0+
## Terraform Setup
Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system.
```sh
$ terraform version
Terraform v1.0.0
```
Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`).
```
cd infra/clusters
```
## Provider
Log in to your Google Console [API Manager](https://console.cloud.google.com/apis/dashboard) and select a project, or [sign up](https://cloud.google.com/free/) if you don't have an account.
Select "Credentials" and create a service account key. Choose the "Compute Engine Admin" and "DNS Administrator" roles and save the JSON private key to a file that can be referenced in configs.
```sh
mv ~/Downloads/project-id-43048204.json ~/.config/google-cloud/terraform.json
```
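If you prefer the CLI to the console, a roughly equivalent sketch with `gcloud` is shown below; the service account name `terraform` and the project `project-id` are placeholders, and the key is written directly to the path used in this tutorial (skipping the download-and-move step):

```sh
# create a service account and grant the Compute Admin and DNS Administrator roles
gcloud iam service-accounts create terraform --display-name "Terraform"
gcloud projects add-iam-policy-binding project-id \
  --member "serviceAccount:terraform@project-id.iam.gserviceaccount.com" \
  --role "roles/compute.admin"
gcloud projects add-iam-policy-binding project-id \
  --member "serviceAccount:terraform@project-id.iam.gserviceaccount.com" \
  --role "roles/dns.admin"
# generate a JSON key for use by the Terraform provider
gcloud iam service-accounts keys create ~/.config/google-cloud/terraform.json \
  --iam-account terraform@project-id.iam.gserviceaccount.com
```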
Configure the Google Cloud provider to use your service account key, project-id, and region in a `providers.tf` file.
```tf
provider "google" {
  project     = "project-id"
  region      = "us-central1"
  credentials = file("~/.config/google-cloud/terraform.json")
}

provider "ct" {}

terraform {
  required_providers {
    ct = {
      source  = "poseidon/ct"
      version = "0.11.0"
    }
    google = {
      source  = "hashicorp/google"
      version = "4.59.0"
    }
  }
}
```
Additional configuration options are described in the `google` provider [docs](https://www.terraform.io/docs/providers/google/index.html).
!!! tip
    Regions are listed in [docs](https://cloud.google.com/compute/docs/regions-zones/regions-zones) or with `gcloud compute regions list`. A project may contain multiple clusters across different regions.
## Cluster
Define a Kubernetes cluster using the module `google-cloud/flatcar-linux/kubernetes`.
```tf
module "yavin" {
  source = "git::https://github.com/poseidon/typhoon//google-cloud/flatcar-linux/kubernetes?ref=v1.31.2"

  # Google Cloud
  cluster_name  = "yavin"
  region        = "us-central1"
  dns_zone      = "example.com"
  dns_zone_name = "example-zone"

  # instances
  worker_count = 2

  # configuration
  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
}
```
Reference the [variables docs](#variables) or the [variables.tf](https://github.com/poseidon/typhoon/blob/master/google-cloud/flatcar-linux/kubernetes/variables.tf) source.
## ssh-agent
Initial bootstrapping requires `bootstrap.service` be started on one controller node. Terraform uses `ssh-agent` to automate this step. Add your SSH private key to `ssh-agent`.
```sh
ssh-add ~/.ssh/id_rsa
ssh-add -L
```
## Apply
Initialize the config directory if this is the first use with Terraform.
```sh
terraform init
```
Plan the resources to be created.
```sh
$ terraform plan
Plan: 78 to add, 0 to change, 0 to destroy.
```
Apply the changes to create the cluster.
```sh
$ terraform apply
module.yavin.null_resource.bootstrap: Still creating... (10s elapsed)
...
module.yavin.null_resource.bootstrap: Still creating... (5m30s elapsed)
module.yavin.null_resource.bootstrap: Still creating... (5m40s elapsed)
module.yavin.null_resource.bootstrap: Creation complete (ID: 5768638456220583358)
Apply complete! Resources: 78 added, 0 changed, 0 destroyed.
```
In 4-8 minutes, the Kubernetes cluster will be ready.
## Verify
[Install kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/) on your system. Obtain the generated cluster `kubeconfig` from module outputs (e.g. write to a local file).
```tf
resource "local_file" "kubeconfig-yavin" {
  content         = module.yavin.kubeconfig-admin
  filename        = "/home/user/.kube/configs/yavin-config"
  file_permission = "0600"
}
```
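Alternatively, you can expose the kubeconfig as a root-level output and write it with the Terraform CLI. A sketch, assuming you name the output `kubeconfig-yavin`:

```tf
output "kubeconfig-yavin" {
  value     = module.yavin.kubeconfig-admin
  sensitive = true
}
```

Then `terraform output -raw kubeconfig-yavin > /home/user/.kube/configs/yavin-config` writes the same file.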
List nodes in the cluster.
```
$ export KUBECONFIG=/home/user/.kube/configs/yavin-config
$ kubectl get nodes
NAME                                       ROLES    STATUS  AGE  VERSION
yavin-controller-0.c.example-com.internal  <none>   Ready   6m   v1.31.2
yavin-worker-jrbf.c.example-com.internal   <none>   Ready   5m   v1.31.2
yavin-worker-mzdm.c.example-com.internal   <none>   Ready   5m   v1.31.2
```
List the pods.
```
$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                   READY  STATUS   RESTARTS  AGE
kube-system   cilium-1cs8z                           1/1    Running  0         6m
kube-system   cilium-d1l5b                           1/1    Running  0         6m
kube-system   cilium-sp9ps                           1/1    Running  0         6m
kube-system   coredns-1187388186-dkh3o               1/1    Running  0         6m
kube-system   coredns-1187388186-zj5dl               1/1    Running  0         6m
kube-system   kube-apiserver-controller-0            1/1    Running  0         6m
kube-system   kube-controller-manager-controller-0   1/1    Running  0         6m
kube-system   kube-proxy-117v6                       1/1    Running  0         6m
kube-system   kube-proxy-9886n                       1/1    Running  0         6m
kube-system   kube-proxy-njn47                       1/1    Running  0         6m
kube-system   kube-scheduler-controller-0            1/1    Running  0         6m
```
## Going Further
Learn about [maintenance](/topics/maintenance/) and [addons](/addons/overview/).
## Variables
Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/google-cloud/flatcar-linux/kubernetes/variables.tf) source.
### Required
| Name | Description | Example |
|:-----|:------------|:--------|
| cluster_name | Unique cluster name (prepended to dns_zone) | "yavin" |
| region | Google Cloud region | "us-central1" |
| dns_zone | Google Cloud DNS zone | "google-cloud.example.com" |
| dns_zone_name | Google Cloud DNS zone name | "example-zone" |
| ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." |
Check the list of valid [regions](https://cloud.google.com/compute/docs/regions-zones/regions-zones) and list Flatcar Linux [images](https://cloud.google.com/compute/docs/images) with `gcloud compute images list | grep flatcar`.
#### DNS Zone
Clusters create a DNS A record `${cluster_name}.${dns_zone}` to resolve a TCP proxy load balancer backed by controller instances. This FQDN is used by workers and `kubectl` to access the apiserver(s). In this example, the cluster's apiserver would be accessible at `yavin.google-cloud.example.com`.
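After `terraform apply`, a quick way to confirm the record resolves (assuming `dig` is installed):

```sh
# should print the load balancer's IP address
dig +short yavin.google-cloud.example.com
```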
You'll need a registered domain name or delegated subdomain on Google Cloud DNS. You can set this up once and create many clusters with unique names.
```tf
resource "google_dns_managed_zone" "zone-for-clusters" {
  dns_name    = "google-cloud.example.com."
  name        = "example-zone"
  description = "Production DNS zone"
}
```
!!! tip ""
    If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Google Cloud (e.g. google-cloud.mydomain.com) and [update nameservers](https://cloud.google.com/dns/update-name-servers).
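To look up the nameservers to set at your registrar for the delegated subdomain, one option (assuming the zone is named `example-zone`) is:

```sh
# print the Cloud DNS nameservers assigned to the managed zone
gcloud dns managed-zones describe example-zone --format='value(nameServers)'
```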
### Optional
| Name | Description | Default | Example |
|:---------------------|:---------------------------------------------------------------------------|:-----------------|:--------------------------------------------|
| os_image | Flatcar Linux image for compute instances | "flatcar-stable" | flatcar-stable, flatcar-beta, flatcar-alpha |
| controller_count | Number of controllers (i.e. masters) | 1 | 3 |
| controller_type | Machine type for controllers | "n1-standard-1" | See below |
| controller_disk_size | Controller disk size in GB | 30 | 20 |
| worker_count | Number of workers | 1 | 3 |
| worker_type | Machine type for workers | "n1-standard-1" | See below |
| worker_disk_size | Worker disk size in GB | 30 | 100 |
| worker_preemptible | If enabled, Compute Engine will terminate workers randomly within 24 hours | false | true |
| controller_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/) |
| worker_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/) |
| networking | Choice of networking provider | "cilium" | "calico" or "cilium" or "flannel" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| worker_node_labels | List of initial worker node labels | [] | ["worker-pool=default"] |
Check the list of valid [machine types](https://cloud.google.com/compute/docs/machine-types).
#### Preemption
Add `worker_preemptible = "true"` to allow worker nodes to be [preempted](https://cloud.google.com/compute/docs/instances/preemptible) at random, but pay [significantly](https://cloud.google.com/compute/pricing) less. Clusters tolerate stopping instances fairly well (reschedules pods, but cannot drain) and preemption provides a nice reward for running fault-tolerant cluster systems.
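For instance, a sketch of the module definition from the Cluster section with preemptible workers enabled:

```tf
module "yavin" {
  source = "git::https://github.com/poseidon/typhoon//google-cloud/flatcar-linux/kubernetes?ref=v1.31.2"

  # Google Cloud
  cluster_name  = "yavin"
  region        = "us-central1"
  dns_zone      = "example.com"
  dns_zone_name = "example-zone"

  # instances: preemptible workers may be terminated within 24 hours
  worker_count       = 2
  worker_preemptible = true

  # configuration
  ssh_authorized_key = "ssh-rsa AAAAB3Nz..."
}
```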