Allow Calico networking on Azure and DigitalOcean

* Introduce "calico" as a `networking` option on Azure and DigitalOcean
using Calico's new VXLAN support (similar to flannel). Flannel remains
the default on these platforms for now.
* Historically, DigitalOcean and Azure only allowed Flannel as the
CNI provider, since those platforms don't support IPIP traffic that
was previously required for Calico.
* Looking forward, its desireable for Calico to become the default
across Typhoon clusters, since it provides NetworkPolicy and a
consistent experience
* No changes to AWS, GCP, or bare-metal where Calico remains the
default CNI provider. On these platforms, IPIP mode will always
be used, since its available and more performant than vxlan
This commit is contained in:
Dalton Hubble
2019-05-06 00:38:23 -07:00
parent b9bab739ce
commit 147c21a4bd
10 changed files with 56 additions and 20 deletions

View File

@ -253,6 +253,7 @@ Reference the DNS zone with `"${azurerm_dns_zone.clusters.name}"` and its resour
| worker_priority | Set priority to Low to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time | Regular | Low |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/#usage) |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/#usage) |
| networking | Choice of networking provider | "flannel" | "flannel" or "calico" (experimental) |
| host_cidr | CIDR IPv4 range to assign to instances | "10.0.0.0/16" | "10.0.0.0/20" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |

View File

@ -253,6 +253,7 @@ Digital Ocean requires the SSH public key be uploaded to your account, so you ma
| image | Container Linux image for instances | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/) |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/) |
| networking | Choice of networking provider | "flannel" | "flannel" or "calico" (experimental) |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by coredns. | "cluster.local" | "k8s.example.com" |

View File

@ -26,20 +26,19 @@ Network performance varies based on the platform and CNI plugin. `iperf` was use
|----------------------------|-------:|-------------:|-------------:|
| AWS (flannel) | 5 Gb/s | 4.94 Gb/s | 4.89 Gb/s |
| AWS (calico, MTU 1480) | 5 Gb/s | 4.94 Gb/s | 4.42 Gb/s |
| AWS (calico, MTU 8981) | 5 Gb/s | 4.94 Gb/s | 4.75 Gb/s |
| Azure (flannel) | Varies | 749 Mb/s | 680 Mb/s |
| AWS (calico, MTU 8981) | 5 Gb/s | 4.94 Gb/s | 4.90 Gb/s |
| Azure (flannel) | Varies | 749 Mb/s | 650 Mb/s |
| Azure (calico) | Varies | 749 Mb/s | 650 Mb/s |
| Bare-Metal (flannel) | 1 Gb/s | 940 Mb/s | 903 Mb/s |
| Bare-Metal (calico) | 1 Gb/s | 940 Mb/s | 931 Mb/s |
| Bare-Metal (flannel, bond) | 3 Gb/s | 2.3 Gb/s | 1.17 Gb/s |
| Bare-Metal (calico, bond) | 3 Gb/s | 2.3 Gb/s | 1.17 Gb/s |
| Digital Ocean | 2 Gb/s | 1.97 Gb/s | 1.64 Gb/s |
| Digital Ocean (flannel) | Varies | 1.97 Gb/s | 1.20 Gb/s |
| Digital Ocean (calico) | Varies | 1.97 Gb/s | 1.20 Gb/s |
| Google Cloud (flannel) | 2 Gb/s | 1.94 Gb/s | 1.76 Gb/s |
| Google Cloud (calico) | 2 Gb/s | 1.94 Gb/s | 1.81 Gb/s |
Notes:
* Calico and Flannel have comparable performance. Platform and configuration differences dominate.
* AWS and Azure node bandwidth (i.e. upper bound) depends greatly on machine type
* Azure and DigitalOcean network performance can be quite variable or depend on machine type
* Only [certain AWS EC2 instance types](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/network_mtu.html#jumbo_frame_instances) allow jumbo frames. This is why the default MTU on AWS must be 1480.
* Neither CNI provider seems to be able to leverage bonded NICs well (bare-metal)