* DigitalOcean introduced Virtual Private Cloud (VPC) support
to match other clouds and enhance the prior "private networking"
feature. Before, droplet's belonging to different clusters (but
residing in the same region) could reach one another (although
Typhoon firewall rules prohibit this). Now, droplets in a VPC
reside in their own network
* https://www.digitalocean.com/docs/networking/vpc/
* Create droplet instances in a VPC per cluster. This matches the
design of Typhoon AWS, Azure, and GCP.
* Require `terraform-provider-digitalocean` v1.16.0+ (action required)
* Output `vpc_id` for use with an attached DigitalOcean
loadbalancer
* Accept experimental CNI `networking` mode "cilium"
* Run Cilium v1.8.0-rc4 with overlay vxlan tunnels and a
minimal set of features. We're interested in:
* IPAM: Divide pod_cidr into /24 subnets per node
* CNI networking pod-to-pod, pod-to-external
* BPF masquerade
* NetworkPolicy as defined by Kubernetes (no L7 Policy)
* Continue using kube-proxy with Cilium probe mode
* Firewall changes:
* Require UDP 8472 for vxlan (Linux kernel default) between nodes
* Optional ICMP echo(8) between nodes for host reachability
(health)
* Optional TCP 4240 between nodes for endpoint reachability (health)
Known Issues:
* Containers with `hostPort` don't listen on all host addresses,
these workloads must use `hostNetwork` for now
https://github.com/cilium/cilium/issues/12116
* Erroneous warning on Fedora CoreOS
https://github.com/cilium/cilium/issues/10256
Note: This is experimental. It is not listed in docs and may be
changed or removed without a deprecation notice
Related:
* https://github.com/poseidon/terraform-render-bootstrap/pull/192
* https://github.com/cilium/cilium/issues/12217
* DigitalOcean firewall rules should reference Terraform tag
resources rather than using tag strings. Otherwise, terraform
apply can fail (neeeds rerun) if a tag has not yet been created
* Configure kube-proxy --metrics-bind-address=0.0.0.0 (default
127.0.0.1) to serve metrics on 0.0.0.0:10249
* Add firewall rules to allow Prometheus (resides on a worker) to
scrape kube-proxy service endpoints on controllers or workers
* Add a clusterIP: None service for kube-proxy endpoint discovery
* Run a kube-apiserver, kube-scheduler, and kube-controller-manager
static pod on each controller node. Previously, kube-apiserver was
self-hosted as a DaemonSet across controllers and kube-scheduler
and kube-controller-manager were a Deployment (with 2 or
controller_count many replicas).
* Remove bootkube bootstrap and pivot to self-hosted
* Remove pod-checkpointer manifests (no longer needed)
* Replace v0.11 bracket type hints with Terraform v0.12 list expressions
* Use expression syntax instead of interpolated strings, where suggested
* Update DigitalOcean tutorial documentation
* Define Terraform and plugin version requirements in versions.tf
* Require digitalocean ~> v1.3 to support Terraform v0.12
* Require ct ~> v0.3.2 to support Terraform v0.12
* Change flannel port from the kernel default 8472 to the
IANA assigned VXLAN port 4789
* Update firewall rules or security groups for VXLAN
* Why now? Calico now offers its own VXLAN backend so
standardizing on the IANA port will simplify config
* https://github.com/coreos/flannel/blob/master/Documentation/backends.md#vxlan
* Add an `enable_aggregation` variable to enable the kube-apiserver
aggregation layer for adding extension apiservers to clusters
* Aggregation is **disabled** by default. Typhoon recommends you not
enable aggregation. Consider whether less invasive ways to achieve your
goals are possible and whether those goals are well-founded
* Enabling aggregation and extension apiservers increases the attack
surface of a cluster and makes extensions a part of the control plane.
Admins must scrutinize and trust any extension apiserver used.
* Passing a v1.14 CNCF conformance test requires aggregation be enabled.
Having an option for aggregation keeps compliance, but retains the
stricter security posture on default clusters
* Define firewall rules on DigitialOcean to match rules used on AWS,
GCP, and Azure
* Output `controller_tag` and `worker_tag` to simplify custom firewall
rule creation
* Adjust firewall rules, security groups, cloud load balancers,
and generated kubeconfig's
* Facilitates some future simplifications and cost reductions
* Bare-Metal users who exposed kube-apiserver on a WAN via their
router or load balancer will need to adjust its configuration.
This is uncommon, most apiserver are on LAN and/or behind VPN
so no routing infrastructure is configured with the port number
* Change port range from keyword "all" to "1-65535", which is the
same but with digitalocean provider 0.1.3 doesn't produce a diff
* Rearrange egress firewall rules to order the Digtial Ocean API
and provider returns. In current testing, this fixes the last diff
that was present on `terraform plan`.
* Without a reference a Digital Ocean tag object, terraform may
try to create a firewall rule before a tag actually exists. By
referencing the actual tag objects, the dependency order is
implied