Compare commits

...

163 Commits

Author SHA1 Message Date
1d66ad33f7 Change AWS worker modules' default type from t2.small to t3.small
* Worker instance types weren't updated in #365
2019-01-12 00:07:48 -08:00
4d32b79c6f Update Kubernetes from v1.13.1 to v1.13.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1132
2019-01-12 00:00:53 -08:00
df4c0ba05d Use HTTPS liveness probes for kube-scheduler and kube-controller-manager
* Disable kube-scheduler and kube-controller-manager HTTP ports
2019-01-09 20:56:50 -08:00
bfe0c74793 Enable the certificates.k8s.io API to issue cluster certificates
* System components that require certificates signed by the cluster
CA can submit a CSR to the apiserver, have an administrator inspect
and approve it, and be issued a certificate
* Configure kube-controller-manager to sign Approved CSRs using the
cluster CA private key
* Admins are responsible for approving or denying CSRs; otherwise,
no certificate is issued. Read the Kubernetes docs carefully and
verify the entity making the request and the authorization level
* https://kubernetes.io/docs/tasks/tls/managing-tls-in-a-cluster
2019-01-06 17:33:37 -08:00
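A minimal sketch of the admin approval flow described in the entry above, using standard `kubectl` commands (the CSR name is a placeholder):
```sh
# List pending certificate signing requests
kubectl get csr
# Inspect the requester, groups, and requested usages before acting
kubectl describe csr example-csr
# Approve after verification (or `kubectl certificate deny example-csr` to reject)
kubectl certificate approve example-csr
```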
60c70797ec Use a single format of the admin kubeconfig
* Use a single admin kubeconfig for initial bootkube bootstrap
and for use by a human admin. Previously, an admin kubeconfig
without a named context was used for bootstrap and direct usage
with KUBECONFIG=path, while one with a named context was used
for `kubectl config use-context` style usage. Confusing.
* Provide the admin kubeconfig via `assets/auth/kubeconfig`,
`assets/auth/CLUSTER-config`, or output `kubeconfig-admin`
2019-01-05 14:57:18 -08:00
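For illustration, the two usage styles unified by the entry above look roughly like this (cluster and context names are placeholders):
```sh
# Direct usage of the generated admin kubeconfig
export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
kubectl get nodes

# Or, with a merged kubeconfig, switch to the cluster's named context
kubectl config use-context yavin
```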
6795a753ea Update CoreDNS from v1.2.6 to v1.3.0
* https://coredns.io/2018/12/15/coredns-1.3.0-release/
2019-01-05 13:35:03 -08:00
b57273b6f1 Rename internal kube_dns_service_ip to cluster_dns_service_ip
* terraform-render-bootkube module deprecated kube_dns_service_ip
output in favor of cluster_dns_service_ip
* Rename k8s_dns_service_ip to cluster_dns_service_ip for
consistency too
2019-01-05 13:32:03 -08:00
812a1adb49 Use a lower-privilege Kubelet kubeconfig in system:nodes
* Kubelets can use a lower-privilege TLS client certificate with
Org system:nodes and a binding to the system:node ClusterRole
* Admin kubeconfigs continue to belong to Org system:masters to
provide cluster-admin (available in assets/auth/kubeconfig or as
a Terraform output kubeconfig-admin)
* Remove bare-metal output variable kubeconfig
2019-01-05 13:08:56 -08:00
1c6a0392ad Fix missing slash in links in the AWS tutorial 2019-01-02 23:33:02 -08:00
5263d00a6f Update mkdocs-material from v3.1.0 to v3.2.0 2019-01-02 23:31:49 -08:00
66e1365cc4 Add ServiceAccounts for kube-apiserver and kube-scheduler
* Add ServiceAccounts and ClusterRoleBindings for kube-apiserver
and kube-scheduler
* Remove the ClusterRoleBinding for the kube-system default ServiceAccount
* Rename the CA certificate CommonName for consistency with upstream
2019-01-01 20:16:14 -08:00
ea8b0d1c84 Update Prometheus addon from v2.5.0 to v2.6.0
* https://github.com/prometheus/prometheus/releases/tag/v2.6.0
2018-12-27 07:35:12 -08:00
f2f4deb8bb Change AWS default type from t2.small to t3.small
* T3 is the next generation general purpose burstable
instance type. Compared with t2.small, the t3.small is
cheaper, has 2 vCPUs (instead of 1), and provides 5 Gbps
of pod-to-pod bandwidth (instead of 1 Gbps)
2018-12-18 12:38:35 -08:00
4d2f33aee6 Update changelog for v1.13.1 release 2018-12-17 14:28:27 -08:00
d42f47c49e Update terraform-provider-ct plugin from v0.2.1 to v0.3.0
* Provide migration instructions for upgrading terraform-provider-ct
in-place for v1.12.2+ clusters
* Require switching from ~/.terraformrc to the Terraform third-party
plugins directory ~/.terraform.d/plugins/
* Require Container Linux 1688.5.3 or newer
2018-12-17 14:13:50 -08:00
53e549f233 Add Flatcar Linux to the issue template 2018-12-16 10:47:59 -08:00
bcb200186d Add admin kubeconfig as a Terraform output
* May be used to write a local file
2018-12-15 22:52:28 -08:00
479d498024 Update Calico from v3.3.2 to v3.4.0
* https://docs.projectcalico.org/v3.4/releases/
2018-12-15 18:05:16 -08:00
e0c032be94 Increase GCP TCP proxy apiserver backend timeout to 5 minutes
* On GCP, kubectl port-forward connections to pods are closed
after a timeout (unlike AWS NLB's or Azure load balancers)
* Increase the GCP apiserver backend service timeout from 1 minute
to 5 minutes to be more similar to AWS/Azure LB behavior
2018-12-15 17:34:18 -08:00
b74bf11772 Update Grafana from v5.4.0 to v5.4.2
* https://github.com/grafana/grafana/releases/tag/v5.4.2
* https://github.com/grafana/grafana/releases/tag/v5.4.1
2018-12-15 12:39:03 -08:00
018c5edc25 Update Kubernetes from v1.13.0 to v1.13.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1131
2018-12-15 11:44:57 -08:00
8aeec0b9b5 Fix typo in descriptive firewall name (#359) 2018-12-15 11:34:32 -08:00
ff6ab571f3 Update Calico from v3.3.1 to v3.3.2
* https://docs.projectcalico.org/v3.3/releases/
2018-12-06 22:56:55 -08:00
991fb44c37 Update Grafana from v5.3.4 to v5.4.0
* https://github.com/grafana/grafana/releases/tag/v5.4.0
2018-12-06 01:33:50 -08:00
d31f444fcd Update Kubernetes from v1.12.3 to v1.13.0 2018-12-03 20:44:32 -08:00
76d993cdae Add experimental kube-router CNI provider
* Add kube-router for pod networking and NetworkPolicy
as an experiment
* Experiments are not documented or supported in any way,
and may be removed without notice. They have known issues
and aren't enabled without special options.
2018-12-03 19:52:28 -08:00
b6016d0a26 Disable Grafana login form, admin user can't be disabled
* Example manifests aim to provide a read-only dashboard visible
to any users with network access (i.e. kubectl port-forward, LAN)
* Problem: Grafana always has an admin user, even with the user
management system disabled
* Disable the login form to prevent admin login
2018-11-28 22:04:08 -08:00
eec314b52f Update CHANGES changelog for release 2018-11-28 09:23:13 -08:00
bcce02a9ce Add Kubelet /etc/iscsi and iscsiadm mounts on bare-metal
* Allow using iSCSI with Container Linux bare-metal clusters
* Warning, iSCSI isn't part of Kubernetes conformance and isn't
regularly evaluated
2018-11-28 00:28:46 -08:00
42c523e6a2 Recommend switch from ~/.terraformrc to 3rd-party plugin dir
* Switch tutorials from using ~/.terraformrc to using the 3rd-party
plugin directory so 3rd-party plugins can be pinned
* Continue to show using terraform-provider-ct v0.2.2. Updating to
a newer version is only safe once all managed clusters are v1.12.2
or higher
2018-11-28 00:03:15 -08:00
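A rough sketch of the plugin directory layout described above; the exact binary path and version depend on the release you download (v0.2.2 shown, per the tutorials):
```sh
# Create the third-party plugins directory and install a pinned plugin version
mkdir -p ~/.terraform.d/plugins
mv ~/Downloads/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.2.2
```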
64b4c10418 Improve features and modules list docs
* Remove bullet about isolating workloads on workers; it's
now common practice and new users will assume it
* List advanced features available in each module
* Fix erroneous Kubernetes version listing for Google Cloud
Fedora Atomic
2018-11-26 22:58:00 -08:00
872b11b948 Update nginx-ingress from v0.20.0 to v0.21.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.21.0
2018-11-26 21:57:34 -08:00
5b27d8d889 Update Kubernetes from v1.12.2 to v1.12.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.12.md/#v1123
2018-11-26 21:06:09 -08:00
840b73f9ba Update pod-checkpointer image to query Kubelet secure API
* Updates pod-checkpointer to prefer the Kubelet secure
API (before falling back to the Kubelet read-only API that
is disabled on Typhoon clusters since
https://github.com/poseidon/typhoon/pull/324)
* Previously, pod-checkpointer checkpointed an initial set
of pods during bootstrapping so recovery from power cycling
clusters was unaffected, but logs were noisy
* https://github.com/kubernetes-incubator/bootkube/pull/1027
* https://github.com/kubernetes-incubator/bootkube/pull/1025
2018-11-26 20:24:32 -08:00
915af3c6cc Fix Calico Felix reporting usage data, require opt-in
* Calico Felix has been reporting anonymous usage data about the
version and cluster size, which violates Typhoon's privacy policy
where analytics should be opt-in only
* Add a variable enable_reporting (default: false) to allow opting
in to reporting usage data to Calico (or future components)
2018-11-20 01:03:00 -08:00
c6586b69fd Use eviction policy Delete for Low priority VMSS workers
* Fix issue where Azure defaults to Deallocate eviction policy,
which required manually restarting deallocated workers
* Require terraform-provider-azurerm v1.19+ to support setting
the eviction_policy
2018-11-18 21:04:50 -08:00
ea3fc6d2a7 Update CoreDNS from v1.2.4 to v1.2.6
* https://coredns.io/2018/11/05/coredns-1.2.6-release/
2018-11-18 16:45:53 -08:00
c8c43f3991 Update Grafana from v5.3.2 to v5.3.4
* https://github.com/grafana/grafana/releases/tag/v5.3.3
* https://github.com/grafana/grafana/releases/tag/v5.3.4
2018-11-18 16:42:50 -08:00
58472438ce Update mkdocs-material theme from v3.0.6 to v3.1.0
* https://github.com/squidfunk/mkdocs-material/releases/tag/3.1.0
2018-11-18 16:08:00 -08:00
7f8e781ae4 Measure DigitalOcean network performance
* Measuring pod-to-pod bandwidth in a few regions (NYC3, FRA1,
SFO1) shows DigitalOcean has made some improvements
2018-11-11 21:08:10 -08:00
56e9a82984 Add flannel resource request and mount only /run/flannel 2018-11-11 20:35:21 -08:00
e95b856a22 Enable CoreDNS loop and loadbalance plugins
* loop sends an initial query to detect infinite forwarding
loops in configured upstream DNS servers and exits with
an error (it's a fatal misconfiguration on the network that
will otherwise cause resolvers to consume memory/CPU until
crashing, masking the problem)
* https://github.com/coredns/coredns/tree/master/plugin/loop
* loadbalance randomizes the ordering of A, AAAA, and MX records
in responses to provide round-robin load balancing (as usual,
clients may still cache responses though)
* https://github.com/coredns/coredns/tree/master/plugin/loadbalance
2018-11-10 17:36:56 -08:00
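For context, an illustrative Corefile with both plugins enabled might look like the following; this is a sketch, not the exact Corefile Typhoon ships:
```
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa
    forward . /etc/resolv.conf
    cache 30
    loop
    loadbalance
}
```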
31f48a81a8 Update docs to show flannel DaemonSet instead of kube-flannel
* No functional change, the rename is just for consistency
2018-11-10 15:16:06 -08:00
2b3f61d1bb Update Calico from v3.3.0 to v3.3.1
* Structure Calico and flannel manifests
* Rename kube-flannel mentions to just flannel
2018-11-10 13:37:12 -08:00
8fd2978c31 Update bootkube image version from v0.13.0 to v0.14.0
* https://github.com/kubernetes-incubator/bootkube/releases/tag/v0.14.0
2018-11-06 23:35:11 -08:00
7de03a1279 Fix Prometheus etcd scrape config for DigitalOcean
* Kubelet uses a node's hostname as the node name, which isn't
resolvable on DigitalOcean. There, the node name was
set to the internal IP until #337 switched to instead configuring
kube-apiserver to prefer the InternalIP for communication
* Explicitly configure etcd scrapes to target each controller by
internal IP and port 2381 (replace __address__)
2018-11-06 23:02:45 -08:00
be9f7b87d6 Update Prometheus from v2.4.3 to v2.5.0
* https://github.com/prometheus/prometheus/releases/tag/v2.5.0
2018-11-06 22:16:12 -08:00
721c847943 Set kube-apiserver kubelet preferred address types
* Prefer InternalIP and ExternalIP over the node's hostname,
to match upstream behavior and kubeadm
* Previously, hostname-override was used to set node names
to internal IP's to work around some cloud providers not
resolving hostnames for instances (e.g. DO droplets)
2018-11-03 22:31:55 -07:00
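Concretely, this corresponds to the kube-apiserver flag below (value per the changelog entry later in this diff); the rest of the invocation is elided:
```sh
kube-apiserver \
  --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname \
  ...
```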
78c9fdc18f Update mkdocs-material docs theme version 2018-10-28 19:45:58 -07:00
884c8b39dc Update Grafana from v5.3.1 to v5.3.2
* https://github.com/grafana/grafana/releases/tag/v5.3.2
2018-10-28 19:44:22 -07:00
0e71f7e565 Ignore controller user_data changes to allow plugin updates
* Updating the `terraform-provider-ct` plugin is known to produce
a `user_data` diff in all pre-existing clusters. Applying the
diff to a pre-existing cluster destroys controller nodes
* Ignore changes to controller `user_data`. Once all managed
clusters use a release containing this change, it is possible
to update the `terraform-provider-ct` plugin (worker `user_data`
will still be modified)
* Changing the module `ref` for an existing cluster and
re-applying is still NOT supported (although this PR
would protect controllers from being destroyed)
2018-10-28 16:48:12 -07:00
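A minimal sketch of the Terraform pattern involved, assuming an AWS controller resource; this is not the module's exact resource definition:
```tf
resource "aws_instance" "controller" {
  # ... required arguments (ami, instance_type, etc.) omitted ...

  # Ignore user_data diffs so terraform-provider-ct upgrades don't
  # propose destroying controller nodes
  lifecycle {
    ignore_changes = ["user_data"]
  }
}
```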
8c4200d425 Add DigitalOcean AAAA DNS records on Fedora Atomic 2018-10-28 14:57:31 -07:00
5be5b261e2 Add an IPv6 address and forwarding rules on Google Cloud
* Allow serving IPv6 applications via Kubernetes Ingress
on Typhoon Google Cloud clusters
* Add `ingress_static_ipv6` output variable for use in AAAA
DNS records
2018-10-28 14:30:58 -07:00
f034ef90ae Add DigitalOcean AAAA DNS records resolving to workers
* Improve the workers "round-robin" DNS FQDN that is created
with each cluster by adding AAAA records
* CNAMEs resolving to the DigitalOcean `workers_dns` output
can be followed to find a droplet's IPv4 or IPv6 address
* The CNI portmap plugin doesn't support IPv6. Hosting IPv6
apps is possible, but requires editing the nginx-ingress
addon with `hostNetwork: true`
2018-10-27 23:09:24 -07:00
3bba1ba0dc Use new azurerm_network_interface_backend_address_pool_association
* Require terraform-provider-azurerm v1.17+
* Inline load_balancer_backend_address_pools_ids is deprecated
and scheduled for removal in the v2.0 provider
* https://github.com/terraform-providers/terraform-provider-azurerm/pull/2079
2018-10-27 22:55:05 -07:00
dbe7604b67 Add primary field to ip_configuration required by Azure
* Required by terraform-provider-azurerm v1.17+
* https://github.com/terraform-providers/terraform-provider-azurerm/pull/2035
2018-10-27 16:44:44 -07:00
9b405a19b2 Fix minor naming inconsistencies in Ignition and CLC data 2018-10-27 16:24:59 -07:00
bfa1a679eb Name AWS and DigitalOcean Ignition data sources consistently 2018-10-27 16:14:44 -07:00
f1da0731d8 Update Kubernetes from v1.12.1 to v1.12.2
* Update CoreDNS from v1.2.2 to v1.2.4
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.12.md#v1122
* https://coredns.io/2018/10/17/coredns-1.2.4-release/
* https://coredns.io/2018/10/16/coredns-1.2.3-release/
2018-10-27 15:47:57 -07:00
d641a058fe Update Calico from v3.2.3 to v3.3.0
* https://docs.projectcalico.org/v3.3/releases/
2018-10-23 20:30:30 -07:00
99a6d5478b Disable Kubelet read-only port 10255
* We can finally disable the Kubelet read-only port 10255!
* Journey: https://github.com/poseidon/typhoon/issues/322#issuecomment-431073073
2018-10-18 21:14:14 -07:00
bc750aec33 Configure Heapster to source metrics from Kubelet authenticated API
* Heapster can now get nodes (i.e. kubelets) from the apiserver and
source metrics from the Kubelet authenticated API (10250) instead of
the Kubelet HTTP read-only API (10255)
* https://github.com/kubernetes/heapster/blob/master/docs/source-configuration.md
* Use the heapster service account token via Kubelet bearer token
authn/authz.
* Permit Heapster to skip CA verification. The CA cert does not contain
IP SANs and cannot since nodes get random IPs that aren't known upfront.
Heapster obtains the node list from the apiserver, so the risk of
spoofing a node is limited. For the same reason, Prometheus scrapes
must skip CA verification for scraping Kubelets provided by the apiserver.
* https://github.com/poseidon/typhoon/blob/v1.12.1/addons/prometheus/config.yaml#L68
* Create a heapster ClusterRole to work around the default Kubernetes
`system:heapster` ClusterRole lacking the proper GET `nodes/stats`
access. See https://github.com/kubernetes/heapster/issues/1936
2018-10-18 21:03:01 -07:00
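A hedged sketch of what pointing Heapster at the Kubelet secure port typically looks like via its `--source` flag; the addon's exact arguments may differ:
```sh
# Illustrative Heapster container command: scrape Kubelets over HTTPS (10250)
# with the ServiceAccount token, skipping CA verification as noted above
/heapster \
  --source="kubernetes:https://kubernetes.default?kubeletHttps=true&kubeletPort=10250&useServiceAccount=true&insecure=true"
```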
d55bfd5589 Fix CoreDNS AntiAffinity spec to prefer spreading replicas
* Pods were still being scheduled at random due to a typo
2018-10-17 22:19:57 -07:00
0be4673e44 Add disk_iops variable for AWS
* Setting disk_iops is required for disk_type io1
* https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html#EBSVolumeTypes
2018-10-17 22:18:54 -07:00
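A usage sketch with the variables named above; the module name, source ref, and values are illustrative:
```tf
module "aws-cluster" {
  source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=..."

  # Provisioned-IOPS volumes require an explicit IOPS value
  disk_type = "io1"
  disk_iops = 3000

  # ... other required cluster arguments omitted ...
}
```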
3b44972d78 Add links to header to CHANGES 2018-10-17 09:08:58 -07:00
0127ee82c1 Update nginx-ingress from v0.19.0 to v0.20.0 2018-10-16 21:35:29 -07:00
a10d6977b8 Update Prometheus from v2.4.2 to v2.4.3
* https://github.com/prometheus/prometheus/releases/tag/v2.4.3
2018-10-16 21:29:41 -07:00
05fe923c14 Update Grafana from v5.3.0 to v5.3.1
* https://github.com/grafana/grafana/releases/tag/v5.3.1
2018-10-16 21:23:44 -07:00
d10620fb58 Add support for Flatcar Linux bare-metal cached_install
* Support bare-metal cached_install=true mode with Flatcar Linux
where assets are fetched from the Matchbox assets cache instead
of from the upstream Flatcar download server
* Skipped in original Flatcar support to keep it simple
https://github.com/poseidon/typhoon/pull/209
2018-10-16 21:15:24 -07:00
9b6113a058 Update Kubernetes from v1.11.3 to v1.12.1
* Mount an empty dir for the controller-manager to work around
https://github.com/kubernetes/kubernetes/issues/68973
* Update coreos/pod-checkpointer to strip affinity from
checkpointed pod manifests. Kubernetes v1.12.0-rc.1 introduced
a default affinity that appears on checkpointed manifests, but
it prevented scheduling. Checkpointed pods should not have an
affinity; they're run directly by the Kubelet on the local node
* https://github.com/kubernetes-incubator/bootkube/issues/1001
* https://github.com/kubernetes/kubernetes/pull/68173
2018-10-16 20:28:13 -07:00
5eb4078d68 Add docker/default seccomp to control plane and addons
* Annotate pods, deployments, and daemonsets to start containers
with the Docker runtime's default seccomp profile
* Overrides Kubernetes default behavior which started containers
with seccomp=unconfined
* https://docs.docker.com/engine/security/seccomp/#pass-a-profile-for-a-container
2018-10-16 20:07:29 -07:00
8f0d2b5db4 Update Grafana from v5.2.4 to v5.3.0 2018-10-13 23:03:31 -07:00
2e89e161e9 Remove Azure admin_password (disabled) now that it's optional
* Requires terraform-provider-azurerm v1.16.0 or higher
https://github.com/terraform-providers/terraform-provider-azurerm/pull/1958
2018-10-13 22:40:58 -07:00
55bb4dfba6 Raise CoreDNS replica count to 2 or more
* Run at least two replicas of CoreDNS to better support
rolling updates (previously, kube-dns had a pod nanny)
* On multi-master clusters, set the CoreDNS replica count
to match the number of masters (e.g. a 3-master cluster
previously used replicas:1, now replicas:3)
* Add AntiAffinity preferred rule to favor distributing
CoreDNS pods across controller nodes
2018-10-13 20:31:29 -07:00
43fe78a2cc Raise scheduler/controller-manager replicas in multi-master
* Continue to ensure scheduler and controller-manager run
at least two replicas to support performing kubectl edits
on single-master clusters (no change)
* For multi-master clusters, set scheduler / controller-manager
replica count to the number of masters (e.g. a 3-master cluster
previously used replicas:2, now replicas:3)
2018-10-13 16:16:29 -07:00
5a283b6443 Update etcd from v3.3.9 to v3.3.10
* https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.3.md#v3310-2018-10-10
2018-10-13 13:14:37 -07:00
ff0c271d7b Fix Calico network policy docs link 2018-10-07 21:02:16 +02:00
89088b6a99 Fix broken README links 2018-10-07 20:56:13 +02:00
ee569e3a59 Change rooted links to absolute links for mkdocs v1.0
* mkdocs v1.0 now evaluates links beginning with / as
absolute links that are only prepended with the site url
2018-10-07 20:51:54 +02:00
daeaec0832 Fix requirements for docs builds 2018-10-07 19:26:55 +02:00
db36036c81 Require terraform-provider-digitalocean plugin ~> 1.0
* Require a terraform-provider-digitalocean plugin version of
1.0 or higher within the same major version (e.g. allow 1.1 but
not 2.0)
* Change requirement from ~> 0.1.2 (which allowed up to but not
including 1.0 release)
2018-10-02 17:09:19 +02:00
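Expressed as a Terraform version constraint, the new requirement is roughly the following (a sketch; the constraint actually lives in the module):
```tf
provider "digitalocean" {
  # Allow 1.x releases but not a future 2.0
  version = "~> 1.0"
}
```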
7653e511be Update CoreDNS and Calico versions
* Update CoreDNS from 1.1.3 to 1.2.2
* Update Calico from v3.2.1 to v3.2.3
2018-10-02 16:07:48 +02:00
f8474d68c9 Fix typo in maintenance docs 2018-09-25 03:07:00 -07:00
032a24133b Update Prometheus from v2.3.2 to v2.4.2
* https://github.com/prometheus/prometheus/releases/tag/v2.4.0
* https://github.com/prometheus/prometheus/releases/tag/v2.4.1
* https://github.com/prometheus/prometheus/releases/tag/v2.4.2
2018-09-21 22:27:11 -07:00
f5f442b658 Improve the issue template prompt
* Clarify that reporting versions as latest is not
helpful
2018-09-15 10:05:07 -07:00
ad871dbfa9 Update Kubernetes from v1.11.2 to v1.11.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1113
2018-09-13 18:50:41 -07:00
5801be53ff Fix typo in ingress addon docs 2018-09-08 18:28:42 -07:00
c1b1669cf8 Fix Google Cloud preemption link in docs 2018-09-08 18:28:25 -07:00
dc03f7a4a9 Update nginx-ingress from 0.17.1 to 0.19.0
* If using --enable-ssl-passthrough or exposing TCP/UDP services,
be aware of https://github.com/kubernetes/ingress-nginx/pull/3038
* Workarounds until the fix merges are to stay on 0.17.1, use the
suggested development image, or revert to securityContext
`runAsNonRoot: false` for a while (less secure)
2018-09-08 17:57:01 -07:00
1b8234eb91 Update Grafana from v5.2.2 to v5.2.4
* https://github.com/grafana/grafana/releases/tag/v5.2.3
* https://github.com/grafana/grafana/releases/tag/v5.2.4
2018-09-08 15:41:20 -07:00
4ba090feb0 Update kube-state-metrics from v1.3.1 to v1.4.0 2018-08-29 09:37:50 -07:00
4882fe1053 Add docs for Azure Ingress and worker pools
* Azure worker pools must be in the same region as
the cluster itself unfortunately
2018-08-27 23:30:56 -07:00
019009e9ee Add outputs for Azure ingress IPv4 and worker pools 2018-08-27 23:30:32 -07:00
991a5c6cee Add new tutorial docs and links 2018-08-27 23:30:32 -07:00
c60ec642bc Fix Azure delete-node script to lowercase hostnames
* Fix issue where worker nodes didn't delete themselves on
scale-down or deallocation (e.g. low priority instances).
Lowercase the hostname and delete the Kubernetes node
* Kubelet registers the lowercase hostname as the node name,
but Azure workers get hostname CLUSTER-worker-GENERATED where
the generated identifier may contain uppercase characters
2018-08-27 23:30:32 -07:00
38b4ff4700 Add module for Typhoon Azure with Container Linux 2018-08-27 23:30:32 -07:00
7eb09237f4 Update Calico from v3.1.3 to v3.2.1
* Add new bird and felix readiness checks
* Read MTU from ConfigMap veth_mtu
* Add RBAC read for serviceaccounts
* Remove invalid description from CRDs
2018-08-25 17:53:11 -07:00
e58b424882 Fix firewall to allow etcd client traffic between controllers
* Broaden internal-etcd firewall rule to allow etcd client
traffic (2379) from other controller nodes
* Previously, kube-apiservers were only able to connect to their
node's local etcd peer. While master node outages were tolerated,
reaching a healthy peer took longer than necessary in some cases
* Reduce time needed to bootstrap a cluster
2018-08-21 23:51:40 -07:00
b8eeafe4f9 Template etcd_servers list to replace null_resource.repeat
* Remove the last usage of null_resource.repeat, which has
always been an eyesore for creating the etcd server list
* Originally, #224 switched to templating the etcd_servers
list for all clouds, but had to revert on GCP in #237
* https://github.com/poseidon/typhoon/pull/224
* https://github.com/poseidon/typhoon/pull/237
2018-08-21 22:46:24 -07:00
bdf1e6986e Fix terraform fmt 2018-08-21 21:59:55 -07:00
99e3721181 Mention Fedora Atomic system container artifacts
* Typhoon for Fedora Atomic uses system containers: container
images containing metadata, built directly from upstream
and published and served through Quay.io
* https://github.com/poseidon/system-containers
2018-08-21 21:52:52 -07:00
ea365b551a Fix docs to mention NLBs instead of ELBs
* Typhoon AWS clusters use an NLB rather than an ELB,
since v1.10.5
* Add a few missing links in CHANGES
2018-08-21 21:40:06 -07:00
bbf2c13eef Remove AWS security rule allowing ICMP packets to nodes
* Deny ICMP packets for consistency across Typhoon clusters on
various clouds and because there isn't much need to allow them
2018-08-21 21:16:16 -07:00
da5d2c5321 Remove GCP firewall rule allowing Nginx Ingress health
* Nginx Ingress addon no longer uses hostNetwork so Prometheus may
scrape port 10254 via the CNI network, rather than via the host
address
2018-08-21 21:06:03 -07:00
bceec9fdf5 Sort firewall / security rules and add comments
* No functional changes to network firewalls
2018-08-21 20:53:16 -07:00
49a9dc9b8b Fix typo in Prometheus alerting rules 2018-08-21 16:55:49 -07:00
bec5250e73 Remove unofficial bare-metal *_networkds variables
* Remove controller_networkds and worker_networkds variables. These
variables were always listed as experimental, unsupported, and excluded
from documentation in anticipation of Container Linux Config snippets
* Use Container Linux Config snippets on bare-metal instead. They
provide safer, more powerful, and more elegant host customization
2018-08-13 23:33:29 -07:00
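For flavor, a small Container Linux Config snippet of the kind that can now be merged in on bare-metal; the file and unit contents are illustrative:
```yaml
# Example CLC snippet: write a sysctl file and enable a oneshot unit
storage:
  files:
    - path: /etc/sysctl.d/example.conf
      filesystem: root
      mode: 0644
      contents:
        inline: |
          net.ipv4.ip_forward=1
systemd:
  units:
    - name: example.service
      enable: true
      contents: |
        [Unit]
        Description=Example oneshot unit
        [Service]
        Type=oneshot
        ExecStart=/usr/bin/echo hello
        [Install]
        WantedBy=multi-user.target
```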
dbdc3fc850 Add nginx-ingress addon manifests for bare-metal 2018-08-11 12:14:23 -07:00
e00f97c578 Update nginx-ingress from 0.16.2 to 0.17.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.17.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.17.0
2018-08-08 00:45:20 -07:00
f7ebdf475d Update Kubernetes from v1.11.1 to v1.11.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1112
2018-08-07 21:57:25 -07:00
716dfe4d17 Fix cluster links in customization docs 2018-07-29 12:46:59 -07:00
edc250d62a Fix Kubelet version for Fedora Atomic modules
* Release v1.11.1 erroneously left Fedora Atomic clusters using
the v1.11.0 Kubelet. The rest of the control plane ran v1.11.1
as expected
* Update Kubelet from v1.11.0 to v1.11.1 so Fedora Atomic matches
Container Linux
* Container Linux modules were not affected
2018-07-29 12:13:29 -07:00
db64ce3312 Update etcd from v3.3.8 to v3.3.9
* https://github.com/coreos/etcd/blob/master/CHANGELOG-3.3.md#v339-2018-07-24
2018-07-29 11:27:37 -07:00
7c327b8bf4 Update from bootkube v0.12.0 to v0.13.0 2018-07-29 11:20:17 -07:00
e6720cf738 Update heapster from v1.5.3 to v1.5.4
* https://github.com/kubernetes/heapster/releases/tag/v1.5.4
2018-07-29 11:19:57 -07:00
844f380b4e Update Grafana from v5.2.1 to v5.2.2
* https://github.com/grafana/grafana/releases/tag/v5.2.2
2018-07-29 11:12:56 -07:00
13beb13aab Add descriptions to bare-metal fedora-atomic variables 2018-07-29 11:07:48 -07:00
90c4a7483d Combine bare-metal CLC snippets maps into one map 2018-07-26 23:31:08 -07:00
4e7dfc115d Support Container Linux Config snippets on bare-metal 2018-07-25 23:14:54 -07:00
ec5ea51141 Remove old migration docs and fix link 2018-07-22 12:23:37 -07:00
d8d524d10b Update Kubernetes from v1.11.0 to v1.11.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1111
2018-07-20 00:41:27 -07:00
02cd8eb8d3 Update Prometheus from v2.3.1 to v2.3.2
* https://github.com/prometheus/prometheus/releases/tag/v2.3.2
2018-07-14 14:25:49 -07:00
84d6cfe7b3 Add Prometheus alert rule for inactive md devices
* node-exporter exposes metrics to Prometheus about total and
active md devices (e.g. disks in mdadm RAID arrays)
* Add alert that fires when a RAID disk fails or becomes inactive
for another reason
2018-07-10 00:20:30 -07:00
3352388fe6 Update changelog and docs for release 2018-07-04 12:28:25 -07:00
915f89d3c8 Update Fedora Atomic from 27 to 28 on bare-metal 2018-07-04 11:41:54 -07:00
f40f60b83c Update Nginx Ingress controller from 0.15.0 to 0.16.2
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.16.2
* https://github.com/kubernetes/ingress-nginx/blob/master/Changelog.md
2018-07-02 22:06:22 -07:00
6f958d7577 Replace kube-dns with CoreDNS
* Add system:coredns ClusterRole and binding
* Annotate CoreDNS for Prometheus metrics scraping
* Remove kube-dns deployment, service, & service account
* https://github.com/poseidon/terraform-render-bootkube/pull/71
* https://kubernetes.io/blog/2018/06/27/kubernetes-1.11-release-announcement/
2018-07-01 22:55:01 -07:00
ee31074679 Promote Typhoon Google Cloud for Container Linux to stable 2018-07-01 22:52:27 -07:00
97517fa7f3 Fix ingress addons docs to use ingress_static_ipv4 var 2018-07-01 22:48:41 -07:00
18502d64d6 Update Fedora Atomic from 27 to 28 on GCP 2018-07-01 22:46:51 -07:00
a3349b5c68 Update heapster from v1.5.2 to v1.5.3 2018-07-01 21:07:52 -07:00
74dc6b0bf9 Update Grafana from 5.1.4 to 5.2.1
* http://docs.grafana.org/guides/whats-new-in-v5-2/
* https://github.com/grafana/grafana/releases/tag/v5.2.0
* https://github.com/grafana/grafana/releases/tag/v5.2.1
2018-07-01 20:55:34 -07:00
fd1de27aef Remove deprecated ingress_static_ip and controllers_ipv4_public outputs 2018-07-01 20:47:46 -07:00
93de7506ef Update Fedora Atomic from 27 to 28 on AWS 2018-06-30 18:55:18 -07:00
def445a344 Update Fedora Atomic kubelet from v1.10.5 to v1.11.0 2018-06-30 16:45:42 -07:00
8464b258d8 Update Kubernetes from v1.10.5 to v1.11.0
* Force apiserver to stop listening on 127.0.0.1:8080
* Remove deprecated Kubelet `--allow-privileged`. Defaults to
true. Use `PodSecurityPolicy` if limiting is desired
* https://github.com/kubernetes/kubernetes/releases/tag/v1.11.0
* https://github.com/poseidon/terraform-render-bootkube/pull/68
2018-06-27 22:47:35 -07:00
855aec5af3 Clarify AWS module output names and changes 2018-06-23 15:29:13 -07:00
0c4d59db87 Use global HTTP/TCP proxy load balancing for Ingress on GCP
* Switch Ingress from regional network load balancers to global
HTTP/TCP Proxy load balancing
* Reduce cost by ~$19/month per cluster. Google bills the first 5
global and regional forwarding rules separately. Typhoon clusters now
use 3 global and 0 regional forwarding rules.
* Worker pools no longer include an extraneous load balancer. Remove
worker module's `ingress_static_ip` output.
* Add `ingress_static_ipv4` output variable
* Add `worker_instance_group` output to allow custom global load
balancing
* Deprecate `controllers_ipv4_public` module output
* Deprecate `ingress_static_ip` module output. Use `ingress_static_ipv4`
2018-06-23 14:37:40 -07:00
2eaf04c68b Drop hostNetwork from nginx-ingress addon
* Both flannel and Calico support host port via `portmap`
* Allows writing NetworkPolicies that reference ingress pods in `from`
or `to`. HostNetwork pods were difficult to write network policy for
since they could circumvent the CNI network to communicate with pods on
the same node.
2018-06-22 00:46:41 -07:00
0227014fa0 Fix terraform formatting 2018-06-22 00:28:36 -07:00
fb6f40051f Disable AWS detailed monitoring on worker nodes
* Basic monitoring (free) is sufficient for casual console browsing
* Detailed monitoring (paid) is not leveraged for CloudWatch anyway
* Favor Prometheus for cloud-agnostic metrics, aggregation, and alerting
2018-06-22 00:26:06 -07:00
316f06df06 Combine NLBs to use one NLB per cluster
* Simplify clusters to come with a single NLB
* Listen for apiserver traffic on port 6443 and forward
to controllers (with healthy apiserver)
* Listen for ingress traffic on ports 80/443 and forward
to workers (with healthy ingress controller)
* Reduce cost of default clusters by 1 NLB ($18.14/month)
* Keep using CNAME records to the `ingress_dns_name` NLB and
the nginx-ingress addon for Ingress (up to a few million RPS)
* Users with heavy traffic (many million RPS) can create their
own separate NLB(s) for Ingress and use the new output worker
target groups
* Fix issue where additional worker pools come with an
extraneous network load balancer
2018-06-21 23:46:57 -07:00
f4d3059b00 Update Kubernetes from v1.10.4 to v1.10.5
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1105
2018-06-21 22:51:39 -07:00
6c5a1964aa Change kube-apiserver port from 443 to 6443
* Adjust firewall rules, security groups, cloud load balancers,
and generated kubeconfigs
* Facilitates some future simplifications and cost reductions
* Bare-Metal users who exposed kube-apiserver on a WAN via their
router or load balancer will need to adjust its configuration.
This is uncommon; most apiservers are on a LAN and/or behind a VPN,
so no routing infrastructure is configured with the port number
2018-06-19 23:48:51 -07:00
6e64634748 Update etcd from v3.3.7 to v3.3.8
* https://github.com/coreos/etcd/releases/tag/v3.3.8
2018-06-19 21:56:21 -07:00
d5de41e07a Update Grafana from 5.1.3 to 5.1.4
* https://github.com/grafana/grafana/releases/tag/v5.1.4
2018-06-19 21:45:15 -07:00
05b99178ae Update prometheus from v2.3.0 to v2.3.1
* https://github.com/prometheus/prometheus/releases/tag/v2.3.1
2018-06-19 21:43:50 -07:00
ed0b781296 Fix possible deadlock for provisioning bare-metal clusters
* Closes #235
2018-06-14 23:15:28 -07:00
51906bf398 Update etcd from v3.3.6 to v3.3.7 2018-06-14 22:46:16 -07:00
18dd7ccc09 Update CLUO from v0.6.0 to v0.7.0 2018-06-14 22:32:36 -07:00
0764bd30b5 Fix typo in AWS MTU tip for using jumbo packets 2018-06-11 18:11:50 -07:00
899424c94f Update mkdocs and docs theme 2018-06-10 12:24:29 -07:00
ca8c0a7ac0 Remove docs site analytics feature 2018-06-10 11:57:39 -07:00
cbe646fba6 Label namespaces to ease writing Network Policies 2018-06-09 11:45:11 -07:00
c166b2ba33 Update prometheus from v2.2.1 to v2.3.0 2018-06-09 11:43:10 -07:00
6676484490 Partially revert b7ed6e7bd35cee39a3f65b47e731938c3006b5cd
* Fix change that broke Google Cloud container-linux and
fedora-atomic https://github.com/poseidon/typhoon/pull/224
2018-06-06 23:48:37 -07:00
79260c48f6 Update Kubernetes from v1.10.3 to v1.10.4 2018-06-06 23:23:11 -07:00
589c3569b7 Update etcd from v3.3.5 to v3.3.6
* https://github.com/coreos/etcd/releases/tag/v3.3.6
2018-06-06 23:19:30 -07:00
4d75ae1373 Recommend against AWS controllers smaller than t2.small 2018-05-30 22:57:15 -07:00
d32e6797ae Annotate Grafana so Prometheus scrapes metrics 2018-05-30 22:37:47 -07:00
32a9a83190 Add Prometheus liveness and readiness probes 2018-05-30 22:34:07 -07:00
6e968cd152 Update Calico from v3.1.2 to v3.1.3
* https://github.com/projectcalico/calico/releases/tag/v3.1.3
* https://github.com/projectcalico/cni-plugin/releases/tag/v3.1.3
2018-05-30 21:32:12 -07:00
6a581ab577 Render etcd_initial_cluster using a template_file 2018-05-30 21:14:49 -07:00
189 changed files with 4718 additions and 1456 deletions


@@ -4,11 +4,11 @@
### Environment
* Platform: aws, bare-metal, google-cloud, digital-ocean
* OS: container-linux, fedora-atomic
* Terraform: `terraform version`
* Plugins: Provider plugin versions
* Ref: Git SHA (if applicable)
* Platform: aws, azure, bare-metal, google-cloud, digital-ocean
* OS: container-linux, flatcar-linux, or fedora-atomic
* Release: Typhoon version or Git SHA (reporting latest is **not** helpful)
* Terraform: `terraform version` (reporting latest is **not** helpful)
* Plugins: Provider plugin versions (reporting latest is **not** helpful)
### Problem


@@ -4,6 +4,314 @@ Notable changes between versions.
## Latest
* Kubernetes [v1.13.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1132)
* Add ServiceAccounts for `kube-apiserver` and `kube-scheduler` ([#370](https://github.com/poseidon/typhoon/pull/370))
* Use lower-privilege TLS client certificates for Kubelets ([#372](https://github.com/poseidon/typhoon/pull/372))
* Use HTTPS liveness probes for `kube-scheduler` and `kube-controller-manager` ([#377](https://github.com/poseidon/typhoon/pull/377))
* Update CoreDNS from v1.2.6 to [v1.3.0](https://coredns.io/2018/12/15/coredns-1.3.0-release/)
* Allow the `certificates.k8s.io` API to issue certificates signed by the cluster CA ([#376](https://github.com/poseidon/typhoon/pull/376))
* Configure controller manager to sign CSRs that are manually [approved](https://kubernetes.io/docs/tasks/tls/managing-tls-in-a-cluster) by an administrator
#### AWS
* Change `controller_type` and `worker_type` default from t2.small to t3.small ([#365](https://github.com/poseidon/typhoon/pull/365))
* t3.small is cheaper, provides 2 vCPUs (instead of 1), and 5 Gbps of pod-to-pod bandwidth!
#### Bare-Metal
* Remove the `kubeconfig` output variable
#### Addons
* Update Prometheus from v2.5.0 to v2.6.0
## v1.13.1
* Kubernetes [v1.13.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1131)
* Update Calico from v3.3.2 to [v3.4.0](https://docs.projectcalico.org/v3.4/releases/) ([#362](https://github.com/poseidon/typhoon/pull/362))
* Install CNI plugins with an init container rather than a sidecar
* Improve the `calico-node` ClusterRole
* Recommend updating `terraform-provider-ct` plugin from v0.2.1 to v0.3.0 ([#363](https://github.com/poseidon/typhoon/pull/363))
* [Migration](https://typhoon.psdn.io/topics/maintenance/#upgrade-terraform-provider-ct) instructions for upgrading `terraform-provider-ct` in-place for v1.12.2+ clusters (**action required**)
* [Require](https://typhoon.psdn.io/topics/maintenance/#terraform-plugins-directory) switching from `~/.terraformrc` to the Terraform [third-party plugins](https://www.terraform.io/docs/configuration/providers.html#third-party-plugins) directory `~/.terraform.d/plugins/`
* Require Container Linux 1688.5.3 or newer
#### Google Cloud
* Increase TCP proxy apiserver backend service timeout from 1 minute to 5 minutes ([#361](https://github.com/poseidon/typhoon/pull/361))
* Align `port-forward` behavior closer to AWS/Azure (no timeout)
#### Addons
* Update Grafana from v5.4.0 to v5.4.2
## v1.13.0
* Kubernetes [v1.13.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1130)
* Update Calico from v3.3.1 to [v3.3.2](https://docs.projectcalico.org/v3.3/releases/)
#### Addons
* Update Grafana from v5.3.4 to v5.4.0
* Disable Grafana login form, since admin user can't be disabled ([#352](https://github.com/poseidon/typhoon/pull/352))
* Example manifests aim to provide a read-only dashboard view
## v1.12.3
* Kubernetes [v1.12.3](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.12.md#v1123)
* Add `enable_reporting` variable (default "false") to provide upstreams with usage data ([#345](https://github.com/poseidon/typhoon/pull/345))
* Change kube-apiserver `--kubelet-preferred-address-types` to InternalIP,ExternalIP,Hostname
* Update Calico from v3.3.0 to [v3.3.1](https://docs.projectcalico.org/v3.3/releases/)
* Disable Felix usage reporting by default ([#345](https://github.com/poseidon/typhoon/pull/345))
* Improve flannel manifests
* [Rename](https://github.com/poseidon/terraform-render-bootkube/commit/d045a8e6b8eccfbb9d69bb51953b5a93d23f67f7) `kube-flannel` DaemonSet to `flannel` and `kube-flannel-cfg` ConfigMap to `flannel-config`
* [Drop](https://github.com/poseidon/terraform-render-bootkube/commit/39f9afb3360ec642e5b98457c8bd07eda35b6c96) unused mounts and add a CPU resource request
* Update CoreDNS from v1.2.4 to [v1.2.6](https://coredns.io/2018/11/05/coredns-1.2.6-release/)
* Enable CoreDNS `loop` and `loadbalance` plugins ([#340](https://github.com/poseidon/typhoon/pull/340))
* Fix pod-checkpointer log noise and checkpointable pods detection ([#346](https://github.com/poseidon/typhoon/pull/346))
* Use kubernetes-incubator/bootkube v0.14.0
* [Recommend](https://typhoon.psdn.io/topics/maintenance/#terraform-plugins-directory) switching from `~/.terraformrc` to the Terraform [third-party plugins](https://www.terraform.io/docs/configuration/providers.html#third-party-plugins) directory `~/.terraform.d/plugins/`.
* Allows pinning `terraform-provider-ct` and `terraform-provider-matchbox` versions
* Improves safety of later plugin version migrations
#### Azure
* Use eviction policy `Delete` for `Low` priority virtual machine scale set workers ([#343](https://github.com/poseidon/typhoon/pull/343))
* Fix issue where Azure defaults to `Deallocate` eviction policy, which required manually restarting deallocated instances. `Delete` policy aligns Azure with AWS and GCP behavior.
* Require `terraform-provider-azurerm` v1.19+ (action required)
#### Bare-Metal
* Add Kubelet `/etc/iscsi` and `iscsiadm` mounts on bare-metal for iSCSI ([#103](https://github.com/poseidon/typhoon/pull/103))
#### Addons
* Update nginx-ingress from v0.20.0 to v0.21.0
* Update Prometheus from v2.4.3 to v2.5.0
* Update Grafana from v5.3.2 to v5.3.4
## v1.12.2
* Kubernetes [v1.12.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.12.md#v1122)
* Update CoreDNS from 1.2.2 to [1.2.4](https://github.com/coredns/coredns/releases/tag/v1.2.4)
* Update Calico from v3.2.3 to [v3.3.0](https://docs.projectcalico.org/v3.3/releases/)
* Disable Kubelet read-only port ([#324](https://github.com/poseidon/typhoon/pull/324))
* Fix CoreDNS AntiAffinity spec to prefer spreading replicas
* Ignore controller node user-data changes ([#335](https://github.com/poseidon/typhoon/pull/335))
* Once all managed clusters use v1.12.2, it is possible to update `terraform-provider-ct`
#### AWS
* Add `disk_iops` variable for EBS volume IOPS ([#314](https://github.com/poseidon/typhoon/pull/314))
#### Azure
* Use new `azurerm_network_interface_backend_address_pool_association` ([#332](https://github.com/poseidon/typhoon/pull/332))
* Require `terraform-provider-azurerm` v1.17+ (action required)
* Add `primary` field to `ip_configuration` needed by v1.17+ ([#331](https://github.com/poseidon/typhoon/pull/331))
#### DigitalOcean
* Add AAAA DNS records resolving to worker nodes ([#333](https://github.com/poseidon/typhoon/pull/333))
* Hosting IPv6 apps requires editing nginx-ingress with `hostNetwork: true`
#### Google Cloud
* Add an IPv6 address and IPv6 forwarding rules for load balancing IPv6 Ingress ([#334](https://github.com/poseidon/typhoon/pull/334))
* Add `ingress_static_ipv6` output variable for use in AAAA DNS records
* Allow serving IPv6 applications via Kubernetes Ingress
#### Addons
* Configure Heapster to scrape Kubelets with bearer token auth ([#323](https://github.com/poseidon/typhoon/pull/323))
* Update Grafana from v5.3.1 to v5.3.2
## v1.12.1
* Kubernetes [v1.12.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.12.md#v1121)
* Update etcd from v3.3.9 to [v3.3.10](https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.3.md#v3310-2018-10-10)
* Update CoreDNS from 1.1.3 to [1.2.2](https://github.com/coredns/coredns/releases/tag/v1.2.2)
* Update Calico from v3.2.1 to [v3.2.3](https://docs.projectcalico.org/v3.2/releases/)
* Raise scheduler and controller-manager replicas to the larger of 2 or the number of controller nodes ([#312](https://github.com/poseidon/typhoon/pull/312))
* Single-controller clusters continue to run 2 replicas as before
* Raise default CoreDNS replicas to the larger of 2 or the number of controller nodes ([#313](https://github.com/poseidon/typhoon/pull/313))
* Add AntiAffinity preferred rule to favor spreading CoreDNS pods
* Annotate control plane and addon containers to use the Docker runtime seccomp profile ([#319](https://github.com/poseidon/typhoon/pull/319))
* Override Kubernetes default behavior that starts containers with `seccomp=unconfined`
#### Azure
* Remove `admin_password` field (disabled) since it is now optional
* Require `terraform-provider-azurerm` v1.16+ (action required)
#### Bare-Metal
* Add support for `cached_install` mode with Flatcar Linux ([#315](https://github.com/poseidon/typhoon/pull/315))
#### DigitalOcean
* Require `terraform-provider-digitalocean` v1.0+ (action required)
#### Addons
* Update nginx-ingress from v0.19.0 to v0.20.0
* Update Prometheus from v2.3.2 to v2.4.3
* Update Grafana from v5.2.4 to v5.3.1
## v1.11.3
* Kubernetes [v1.11.3](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1113)
* Introduce Typhoon for Azure as alpha ([#288](https://github.com/poseidon/typhoon/pull/288))
* Special thanks @justaugustus for an earlier variant
* Update Calico from v3.1.3 to v3.2.1 ([#278](https://github.com/poseidon/typhoon/pull/278))
#### AWS
* Remove firewall rule allowing ICMP packets to nodes ([#285](https://github.com/poseidon/typhoon/pull/285))
#### Bare-Metal
* Remove `controller_networkds` and `worker_networkds` variables. Use Container Linux Config snippets [#277](https://github.com/poseidon/typhoon/pull/277)
#### Google Cloud
* Fix firewall to allow etcd client port 2379 traffic between controller nodes ([#287](https://github.com/poseidon/typhoon/pull/287))
* kube-apiservers were only able to connect to their node's local etcd peer. While master node outages were tolerated, reaching a healthy peer took longer than necessary in some cases
* Reduce time needed to bootstrap the cluster
* Remove firewall rule allowing workers to access Nginx Ingress health check ([#284](https://github.com/poseidon/typhoon/pull/284))
* Nginx Ingress addon no longer uses hostNetwork, Prometheus scrapes via CNI network
#### Addons
* Update nginx-ingress from 0.17.1 to 0.19.0
* Update kube-state-metrics from v1.3.1 to v1.4.0
* Update Grafana from 5.2.2 to 5.2.4
## v1.11.2
* Kubernetes [v1.11.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1112)
* Update etcd from v3.3.8 to [v3.3.9](https://github.com/coreos/etcd/blob/master/CHANGELOG-3.3.md#v339-2018-07-24)
* Use kubernetes-incubator/bootkube v0.13.0
* Fix Fedora Atomic modules' Kubelet version ([#270](https://github.com/poseidon/typhoon/issues/270))
#### Bare-Metal
* Introduce [Container Linux Config snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) on bare-metal
* Validate and additively merge custom Container Linux Configs during terraform plan
* Define files, systemd units, dropins, networkd configs, mounts, users, and more
* [Require](https://typhoon.psdn.io/cl/bare-metal/#terraform-setup) `terraform-provider-ct` plugin v0.2.1 (**action required!**)
#### Addons
* Update nginx-ingress from 0.16.2 to 0.17.1
* Add nginx-ingress manifests for bare-metal
* Update Grafana from 5.2.1 to 5.2.2
* Update heapster from v1.5.3 to v1.5.4
## v1.11.1
* Kubernetes [v1.11.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1111)
#### Addons
* Update Prometheus from v2.3.1 to v2.3.2
#### Errata
* Fedora Atomic modules shipped with Kubelet v1.11.0, instead of v1.11.1. Fixed in [#270](https://github.com/poseidon/typhoon/issues/270).
## v1.11.0
* Kubernetes [v1.11.0](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1110)
* Force apiserver to stop listening on `127.0.0.1:8080`
* Replace `kube-dns` with [CoreDNS](https://coredns.io/) ([#261](https://github.com/poseidon/typhoon/pull/261))
* Edit the `coredns` ConfigMap to [customize](https://coredns.io/plugins/)
* CoreDNS doesn't use a resizer. For large clusters, scaling may be required.
#### AWS
* Update from Fedora Atomic 27 to 28 ([#258](https://github.com/poseidon/typhoon/pull/258))
#### Bare-Metal
* Update from Fedora Atomic 27 to 28 ([#263](https://github.com/poseidon/typhoon/pull/263))
#### Google
* Promote Google Cloud to stable
* Update from Fedora Atomic 27 to 28 ([#259](https://github.com/poseidon/typhoon/pull/259))
* Remove `ingress_static_ip` module output. Use `ingress_static_ipv4`.
* Remove `controllers_ipv4_public` module output.
#### Addons
* Update nginx-ingress from 0.15.0 to 0.16.2
* Update Grafana from 5.1.4 to [5.2.1](http://docs.grafana.org/guides/whats-new-in-v5-2/)
* Update heapster from v1.5.2 to v1.5.3
## v1.10.5
* Kubernetes [v1.10.5](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1105)
* Update etcd from v3.3.6 to v3.3.8 ([#243](https://github.com/poseidon/typhoon/pull/243), [#247](https://github.com/poseidon/typhoon/pull/247))
#### AWS
* Switch `kube-apiserver` port from 443 to 6443 ([#248](https://github.com/poseidon/typhoon/pull/248))
* Combine apiserver and ingress NLBs ([#249](https://github.com/poseidon/typhoon/pull/249))
* Reduce cost by ~$18/month per cluster. Typhoon AWS clusters now use one network load balancer.
* Ingress addon users may keep using CNAME records to the `ingress_dns_name` module output (few million RPS)
* Ingress users with heavy traffic (many million RPS) should create a separate NLB(s)
* Worker pools no longer include an extraneous load balancer. Remove worker module's `ingress_dns_name` output
* Disable detailed (paid) monitoring on worker nodes ([#251](https://github.com/poseidon/typhoon/pull/251))
* Favor Prometheus for cloud-agnostic metrics, aggregation, and alerting
* Add `worker_target_group_http` and `worker_target_group_https` module outputs to allow custom load balancing
* Add `target_group_http` and `target_group_https` worker module outputs to allow custom load balancing
#### Bare-Metal
* Switch `kube-apiserver` port from 443 to 6443 ([#248](https://github.com/poseidon/typhoon/pull/248))
* Users who exposed kube-apiserver on a WAN via their router/load-balancer will need to adjust its configuration (e.g. DNAT 6443). Most apiservers are on a LAN (internal, VPN-only, etc) so if you didn't specially configure network gear for 443, no change is needed. (possible action required)
* Fix possible deadlock when provisioning clusters larger than 10 nodes ([#244](https://github.com/poseidon/typhoon/pull/244))
#### DigitalOcean
* Switch `kube-apiserver` port from 443 to 6443 ([#248](https://github.com/poseidon/typhoon/pull/248))
* Update firewall rules and generated kubeconfig's
#### Google Cloud
* Use global HTTP and TCP proxy load balancing for Kubernetes Ingress ([#252](https://github.com/poseidon/typhoon/pull/252))
* Switch Ingress from regional network load balancers to global HTTP/TCP Proxy load balancing
* Reduce cost by ~$19/month per cluster. Google bills the first 5 global and regional forwarding rules separately. Typhoon clusters now use 3 global and 0 regional forwarding rules.
* Worker pools no longer include an extraneous load balancer. Remove worker module's `ingress_static_ip` output
* Allow using nginx-ingress addon on Fedora Atomic clusters ([#200](https://github.com/poseidon/typhoon/issues/200))
* Add `worker_instance_group` module output to allow custom global load balancing
* Add `instance_group` worker module output to allow custom global load balancing
* Deprecate `ingress_static_ip` module output. Add `ingress_static_ipv4` module output instead.
* Deprecate `controllers_ipv4_public` module output
#### Addons
* Update CLUO from v0.6.0 to v0.7.0 ([#242](https://github.com/poseidon/typhoon/pull/242))
* Update Prometheus from v2.3.0 to v2.3.1
* Update Grafana from 5.1.3 to 5.1.4
* Drop `hostNetwork` from nginx-ingress addon
* Both flannel and Calico support host port via `portmap`
* Allows writing NetworkPolicies that reference ingress pods in `from` or `to`. HostNetwork pods were difficult to write network policy for since they could circumvent the CNI network to communicate with pods on the same node.
## v1.10.4
* Kubernetes [v1.10.4](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1104)
* Update etcd from v3.3.5 to v3.3.6
* Update Calico from v3.1.2 to v3.1.3
#### Addons
* Update Prometheus from v2.2.1 to v2.3.0
* Add Prometheus liveness and readiness probes
* Annotate Grafana service so Prometheus scrapes metrics
* Label namespaces to ease writing Network Policies
## v1.10.3
* Kubernetes [v1.10.3](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1103)


@@ -11,34 +11,38 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.10.3 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* Kubernetes v1.13.2 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/) and [preemption](https://typhoon.psdn.io/google-cloud/#preemption) (varies by platform)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [preemptible](https://typhoon.psdn.io/cl/google-cloud/#preemption) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
* Ready for Ingress, Prometheus, Grafana, CSI, or other [addons](https://typhoon.psdn.io/addons/overview/)
## Modules
Typhoon provides a Terraform Module for each supported operating system and platform.
Typhoon provides a Terraform Module for each supported operating system and platform. Container Linux is a mature and reliable choice. Also, Kinvolk's Flatcar Linux fork is selectable on AWS and bare-metal.
| Platform | Operating System | Terraform Module | Status |
|---------------|------------------|------------------|--------|
| AWS | Container Linux | [aws/container-linux/kubernetes](aws/container-linux/kubernetes) | stable |
| AWS | Fedora Atomic | [aws/fedora-atomic/kubernetes](aws/fedora-atomic/kubernetes) | alpha |
| Azure | Container Linux | [azure/container-linux/kubernetes](cl/azure.md) | alpha |
| Bare-Metal | Container Linux | [bare-metal/container-linux/kubernetes](bare-metal/container-linux/kubernetes) | stable |
| Bare-Metal | Fedora Atomic | [bare-metal/fedora-atomic/kubernetes](bare-metal/fedora-atomic/kubernetes) | alpha |
| Digital Ocean | Container Linux | [digital-ocean/container-linux/kubernetes](digital-ocean/container-linux/kubernetes) | beta |
| Digital Ocean | Fedora Atomic | [digital-ocean/fedora-atomic/kubernetes](digital-ocean/fedora-atomic/kubernetes) | alpha |
| Google Cloud | Container Linux | [google-cloud/container-linux/kubernetes](google-cloud/container-linux/kubernetes) | beta |
| Google Cloud | Fedora Atomic | [google-cloud/fedora-atomic/kubernetes](google-cloud/fedora-atomic/kubernetes) | alpha |
| Google Cloud | Container Linux | [google-cloud/container-linux/kubernetes](google-cloud/container-linux/kubernetes) | stable |
The AWS and bare-metal `container-linux` modules allow picking Red Hat Container Linux (formerly CoreOS Container Linux) or Kinvolk's Flatcar Linux friendly fork.
Fedora Atomic support is alpha and will evolve as Fedora Atomic is replaced by Fedora CoreOS.
| Platform | Operating System | Terraform Module | Status |
|---------------|------------------|------------------|--------|
| AWS | Fedora Atomic | [aws/fedora-atomic/kubernetes](aws/fedora-atomic/kubernetes) | alpha |
| Bare-Metal | Fedora Atomic | [bare-metal/fedora-atomic/kubernetes](bare-metal/fedora-atomic/kubernetes) | alpha |
| Digital Ocean | Fedora Atomic | [digital-ocean/fedora-atomic/kubernetes](digital-ocean/fedora-atomic/kubernetes) | alpha |
| Google Cloud | Fedora Atomic | [google-cloud/fedora-atomic/kubernetes](google-cloud/fedora-atomic/kubernetes) | alpha |
## Documentation
* [Docs](https://typhoon.psdn.io)
* Architecture [concepts](https://typhoon.psdn.io/architecture/concepts/) and [operating systems](https://typhoon.psdn.io/architecture/operating-systems/)
* Tutorials for [AWS](https://typhoon.psdn.io/cl/aws/), [Bare-Metal](https://typhoon.psdn.io/cl/bare-metal/), [Digital Ocean](https://typhoon.psdn.io/cl/digital-ocean/), and [Google-Cloud](https://typhoon.psdn.io/cl/google-cloud/)
* Tutorials for [AWS](docs/cl/aws.md), [Azure](docs/cl/azure.md), [Bare-Metal](docs/cl/bare-metal.md), [Digital Ocean](docs/cl/digital-ocean.md), and [Google-Cloud](docs/cl/google-cloud.md)
## Usage
@@ -46,7 +50,7 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo
```tf
module "google-cloud-yavin" {
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.10.3"
source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.13.2"
providers = {
google = "google.default"
@@ -71,15 +75,14 @@ module "google-cloud-yavin" {
}
```
Fetch modules, plan the changes to be made, and apply the changes.
Initialize modules, plan the changes to be made, and apply the changes.
```sh
$ terraform init
$ terraform get --update
$ terraform plan
Plan: 37 to add, 0 to change, 0 to destroy.
Plan: 64 to add, 0 to change, 0 to destroy.
$ terraform apply
Apply complete! Resources: 37 added, 0 changed, 0 destroyed.
Apply complete! Resources: 64 added, 0 changed, 0 destroyed.
```
In 4-8 minutes (varies by platform), the cluster will be ready. This Google Cloud example creates a `yavin.example.com` DNS record to resolve to a network load balancer across controller nodes.
@ -87,10 +90,10 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou
```sh
$ export KUBECONFIG=/home/user/.secrets/clusters/yavin/auth/kubeconfig
$ kubectl get nodes
NAME STATUS AGE VERSION
yavin-controller-0.c.example-com.internal Ready 6m v1.10.3
yavin-worker-jrbf.c.example-com.internal Ready 5m v1.10.3
yavin-worker-mzdm.c.example-com.internal Ready 5m v1.10.3
NAME ROLES STATUS AGE VERSION
yavin-controller-0.c.example-com.internal controller,master Ready 6m v1.13.2
yavin-worker-jrbf.c.example-com.internal node Ready 5m v1.13.2
yavin-worker-mzdm.c.example-com.internal node Ready 5m v1.13.2
```
List the pods.
@ -101,16 +104,18 @@ NAMESPACE NAME READY STATUS RESTART
kube-system calico-node-1cs8z 2/2 Running 0 6m
kube-system calico-node-d1l5b 2/2 Running 0 6m
kube-system calico-node-sp9ps 2/2 Running 0 6m
kube-system coredns-1187388186-zj5dl 1/1 Running 0 6m
kube-system coredns-1187388186-dkh3o 1/1 Running 0 6m
kube-system kube-apiserver-zppls 1/1 Running 0 6m
kube-system kube-controller-manager-3271970485-gh9kt 1/1 Running 0 6m
kube-system kube-controller-manager-3271970485-h90v8 1/1 Running 1 6m
kube-system kube-dns-1187388186-zj5dl 3/3 Running 0 6m
kube-system kube-proxy-117v6 1/1 Running 0 6m
kube-system kube-proxy-9886n 1/1 Running 0 6m
kube-system kube-proxy-njn47 1/1 Running 0 6m
kube-system kube-scheduler-3895335239-5x87r 1/1 Running 0 6m
kube-system kube-scheduler-3895335239-bzrrt 1/1 Running 1 6m
kube-system pod-checkpointer-l6lrt 1/1 Running 0 6m
kube-system pod-checkpointer-l6lrt-controller-0 1/1 Running 0 6m
```
## Non-Goals

View File

@ -15,10 +15,12 @@ spec:
metadata:
labels:
app: container-linux-update-agent
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
containers:
- name: update-agent
image: quay.io/coreos/container-linux-update-operator:v0.6.0
image: quay.io/coreos/container-linux-update-operator:v0.7.0
command:
- "/bin/update-agent"
volumeMounts:

View File

@ -12,10 +12,12 @@ spec:
metadata:
labels:
app: container-linux-update-operator
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
containers:
- name: update-operator
image: quay.io/coreos/container-linux-update-operator:v0.6.0
image: quay.io/coreos/container-linux-update-operator:v0.7.0
command:
- "/bin/update-operator"
env:

View File

@ -18,15 +18,19 @@ spec:
labels:
name: grafana
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
containers:
- name: grafana
image: grafana/grafana:5.1.3
image: grafana/grafana:5.4.2
env:
- name: GF_SERVER_HTTP_PORT
value: "8080"
- name: GF_AUTH_BASIC_ENABLED
value: "false"
- name: GF_AUTH_DISABLE_LOGIN_FORM
value: "true"
- name: GF_AUTH_ANONYMOUS_ENABLED
value: "true"
- name: GF_AUTH_ANONYMOUS_ORG_ROLE

View File

@ -3,6 +3,9 @@ kind: Service
metadata:
name: grafana
namespace: monitoring
annotations:
prometheus.io/scrape: 'true'
prometheus.io/port: '8080'
spec:
type: ClusterIP
selector:

View File

@ -5,7 +5,7 @@ metadata:
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:heapster
name: heapster
subjects:
- kind: ServiceAccount
name: heapster

View File

@ -0,0 +1,30 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: heapster
rules:
- apiGroups:
- ""
resources:
- events
- namespaces
- nodes
- pods
verbs:
- get
- list
- watch
- apiGroups:
- extensions
resources:
- deployments
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- nodes/stats
verbs:
- get

View File

@ -14,14 +14,16 @@ spec:
labels:
name: heapster
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
serviceAccountName: heapster
containers:
- name: heapster
image: k8s.gcr.io/heapster-amd64:v1.5.2
image: k8s.gcr.io/heapster-amd64:v1.5.4
command:
- /heapster
- --source=kubernetes.summary_api:''
- --source=kubernetes.summary_api:''?useServiceAccount=true&kubeletHttps=true&kubeletPort=10250&insecure=true
livenessProbe:
httpGet:
path: /healthz

View File

@ -2,3 +2,5 @@ apiVersion: v1
kind: Namespace
metadata:
name: ingress
labels:
name: ingress

View File

@ -14,6 +14,8 @@ spec:
labels:
name: default-backend
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
containers:
- name: default-backend

View File

@ -17,13 +17,14 @@ spec:
labels:
name: nginx-ingress-controller
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
nodeSelector:
node-role.kubernetes.io/node: ""
hostNetwork: true
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.15.0
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.21.0
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-backend
@ -68,6 +69,11 @@ spec:
successThreshold: 1
timeoutSeconds: 1
securityContext:
runAsNonRoot: false
capabilities:
add:
- NET_BIND_SERVICE
drop:
- ALL
runAsUser: 33 # www-data
restartPolicy: Always
terminationGracePeriodSeconds: 60

View File

@ -0,0 +1,6 @@
apiVersion: v1
kind: Namespace
metadata:
name: ingress
labels:
name: ingress

View File

@ -0,0 +1,42 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: default-backend
namespace: ingress
spec:
replicas: 1
selector:
matchLabels:
name: default-backend
phase: prod
template:
metadata:
labels:
name: default-backend
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
containers:
- name: default-backend
# Any image is permissible as long as:
# 1. It serves a 404 page at /
# 2. It serves 200 on a /healthz endpoint
image: k8s.gcr.io/defaultbackend:1.4
ports:
- containerPort: 8080
resources:
limits:
cpu: 10m
memory: 20Mi
requests:
cpu: 10m
memory: 20Mi
livenessProbe:
httpGet:
path: /healthz
port: 8080
scheme: HTTP
initialDelaySeconds: 30
timeoutSeconds: 5
terminationGracePeriodSeconds: 60

View File

@ -0,0 +1,15 @@
apiVersion: v1
kind: Service
metadata:
name: default-backend
namespace: ingress
spec:
type: ClusterIP
selector:
name: default-backend
phase: prod
ports:
- name: http
protocol: TCP
port: 80
targetPort: 8080

View File

@ -0,0 +1,79 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-ingress-controller
namespace: ingress
spec:
replicas: 2
strategy:
rollingUpdate:
maxUnavailable: 1
selector:
matchLabels:
name: nginx-ingress-controller
phase: prod
template:
metadata:
labels:
name: nginx-ingress-controller
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
nodeSelector:
node-role.kubernetes.io/node: ""
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.21.0
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-backend
- --ingress-class=public
# use downward API
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
ports:
- name: http
containerPort: 80
hostPort: 80
- name: https
containerPort: 443
hostPort: 443
- name: health
containerPort: 10254
hostPort: 10254
livenessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 10254
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
readinessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 10254
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
securityContext:
capabilities:
add:
- NET_BIND_SERVICE
drop:
- ALL
runAsUser: 33 # www-data
restartPolicy: Always
terminationGracePeriodSeconds: 60

View File

@ -0,0 +1,12 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: ingress
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: ingress
subjects:
- kind: ServiceAccount
namespace: ingress
name: default

View File

@ -0,0 +1,51 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: ingress
rules:
- apiGroups:
- ""
resources:
- configmaps
- endpoints
- nodes
- pods
- secrets
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- apiGroups:
- ""
resources:
- services
verbs:
- get
- list
- watch
- apiGroups:
- "extensions"
resources:
- ingresses
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- events
verbs:
- create
- patch
- apiGroups:
- "extensions"
resources:
- ingresses/status
verbs:
- update

View File

@ -0,0 +1,13 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: ingress
namespace: ingress
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: ingress
subjects:
- kind: ServiceAccount
namespace: ingress
name: default

View File

@ -0,0 +1,41 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: ingress
namespace: ingress
rules:
- apiGroups:
- ""
resources:
- configmaps
- pods
- secrets
verbs:
- get
- apiGroups:
- ""
resources:
- configmaps
resourceNames:
# Defaults to "<election-id>-<ingress-class>"
# Here: "ingress-controller-leader-public"
# This has to be adapted if you change either parameter
# when launching the nginx-ingress-controller.
- "ingress-controller-leader-public"
verbs:
- get
- update
- apiGroups:
- ""
resources:
- configmaps
verbs:
- create
- apiGroups:
- ""
resources:
- endpoints
verbs:
- get
- create
- update

View File

@ -0,0 +1,22 @@
apiVersion: v1
kind: Service
metadata:
name: nginx-ingress-controller
namespace: ingress
annotations:
prometheus.io/scrape: 'true'
prometheus.io/port: '10254'
spec:
type: ClusterIP
selector:
name: nginx-ingress-controller
phase: prod
ports:
- name: http
protocol: TCP
port: 80
targetPort: 80
- name: https
protocol: TCP
port: 443
targetPort: 443

View File

@ -0,0 +1,6 @@
apiVersion: v1
kind: Namespace
metadata:
name: ingress
labels:
name: ingress

View File

@ -0,0 +1,42 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: default-backend
namespace: ingress
spec:
replicas: 1
selector:
matchLabels:
name: default-backend
phase: prod
template:
metadata:
labels:
name: default-backend
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
containers:
- name: default-backend
# Any image is permissible as long as:
# 1. It serves a 404 page at /
# 2. It serves 200 on a /healthz endpoint
image: k8s.gcr.io/defaultbackend:1.4
ports:
- containerPort: 8080
resources:
limits:
cpu: 10m
memory: 20Mi
requests:
cpu: 10m
memory: 20Mi
livenessProbe:
httpGet:
path: /healthz
port: 8080
scheme: HTTP
initialDelaySeconds: 30
timeoutSeconds: 5
terminationGracePeriodSeconds: 60

View File

@ -0,0 +1,15 @@
apiVersion: v1
kind: Service
metadata:
name: default-backend
namespace: ingress
spec:
type: ClusterIP
selector:
name: default-backend
phase: prod
ports:
- name: http
protocol: TCP
port: 80
targetPort: 8080

View File

@ -0,0 +1,75 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: ingress-controller-public
namespace: ingress
spec:
replicas: 2
strategy:
rollingUpdate:
maxUnavailable: 1
selector:
matchLabels:
name: ingress-controller-public
phase: prod
template:
metadata:
labels:
name: ingress-controller-public
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.21.0
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-backend
- --ingress-class=public
# use downward API
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
ports:
- name: http
containerPort: 80
- name: https
containerPort: 443
- name: health
containerPort: 10254
livenessProbe:
httpGet:
path: /healthz
port: 10254
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
failureThreshold: 3
timeoutSeconds: 1
readinessProbe:
httpGet:
path: /healthz
port: 10254
scheme: HTTP
periodSeconds: 10
successThreshold: 1
failureThreshold: 3
timeoutSeconds: 1
securityContext:
capabilities:
add:
- NET_BIND_SERVICE
drop:
- ALL
runAsUser: 33 # www-data
restartPolicy: Always
terminationGracePeriodSeconds: 60

View File

@ -0,0 +1,12 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: ingress
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: ingress
subjects:
- kind: ServiceAccount
namespace: ingress
name: default

View File

@ -0,0 +1,51 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: ingress
rules:
- apiGroups:
- ""
resources:
- configmaps
- endpoints
- nodes
- pods
- secrets
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- apiGroups:
- ""
resources:
- services
verbs:
- get
- list
- watch
- apiGroups:
- "extensions"
resources:
- ingresses
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- events
verbs:
- create
- patch
- apiGroups:
- "extensions"
resources:
- ingresses/status
verbs:
- update

View File

@ -0,0 +1,13 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: ingress
namespace: ingress
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: ingress
subjects:
- kind: ServiceAccount
namespace: ingress
name: default

View File

@ -0,0 +1,41 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: ingress
namespace: ingress
rules:
- apiGroups:
- ""
resources:
- configmaps
- pods
- secrets
verbs:
- get
- apiGroups:
- ""
resources:
- configmaps
resourceNames:
# Defaults to "<election-id>-<ingress-class>"
# Here: "ingress-controller-leader-public"
# This has to be adapted if you change either parameter
# when launching the nginx-ingress-controller.
- "ingress-controller-leader-public"
verbs:
- get
- update
- apiGroups:
- ""
resources:
- configmaps
verbs:
- create
- apiGroups:
- ""
resources:
- endpoints
verbs:
- get
- create
- update

View File

@ -0,0 +1,23 @@
apiVersion: v1
kind: Service
metadata:
name: ingress-controller-public
namespace: ingress
annotations:
prometheus.io/scrape: 'true'
prometheus.io/port: '10254'
spec:
type: ClusterIP
clusterIP: 10.3.0.12
selector:
name: ingress-controller-public
phase: prod
ports:
- name: http
protocol: TCP
port: 80
targetPort: 80
- name: https
protocol: TCP
port: 443
targetPort: 443

View File

@ -2,3 +2,5 @@ apiVersion: v1
kind: Namespace
metadata:
name: ingress
labels:
name: ingress

View File

@ -17,13 +17,14 @@ spec:
labels:
name: nginx-ingress-controller
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
nodeSelector:
node-role.kubernetes.io/node: ""
hostNetwork: true
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.15.0
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.21.0
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-backend
@ -68,6 +69,11 @@ spec:
successThreshold: 1
timeoutSeconds: 1
securityContext:
runAsNonRoot: false
capabilities:
add:
- NET_BIND_SERVICE
drop:
- ALL
runAsUser: 33 # www-data
restartPolicy: Always
terminationGracePeriodSeconds: 60

View File

@ -14,6 +14,8 @@ spec:
labels:
name: default-backend
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
containers:
- name: default-backend

View File

@ -2,3 +2,5 @@ apiVersion: v1
kind: Namespace
metadata:
name: ingress
labels:
name: ingress

View File

@ -14,6 +14,8 @@ spec:
labels:
name: default-backend
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
containers:
- name: default-backend

View File

@ -17,13 +17,14 @@ spec:
labels:
name: nginx-ingress-controller
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
nodeSelector:
node-role.kubernetes.io/node: ""
hostNetwork: true
containers:
- name: nginx-ingress-controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.15.0
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.21.0
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-backend
@ -68,6 +69,11 @@ spec:
successThreshold: 1
timeoutSeconds: 1
securityContext:
runAsNonRoot: false
capabilities:
add:
- NET_BIND_SERVICE
drop:
- ALL
runAsUser: 33 # www-data
restartPolicy: Always
terminationGracePeriodSeconds: 60

View File

@ -2,3 +2,5 @@ apiVersion: v1
kind: Namespace
metadata:
name: monitoring
labels:
name: monitoring

View File

@ -102,7 +102,7 @@ data:
regex: 'true'
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- source_labels: [__meta_kubernetes_node_name]
- source_labels: [__meta_kubernetes_node_address_InternalIP]
action: replace
target_label: __address__
replacement: '${1}:2381'

View File

@ -14,33 +14,46 @@ spec:
labels:
name: prometheus
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
serviceAccountName: prometheus
containers:
- name: prometheus
image: quay.io/prometheus/prometheus:v2.2.1
args:
- --config.file=/etc/prometheus/prometheus.yaml
- --storage.tsdb.path=/var/lib/prometheus
ports:
- name: web
containerPort: 9090
volumeMounts:
- name: config
mountPath: /etc/prometheus
- name: rules
mountPath: /etc/prometheus/rules
- name: data
mountPath: /var/lib/prometheus
dnsPolicy: ClusterFirst
restartPolicy: Always
- name: prometheus
image: quay.io/prometheus/prometheus:v2.6.0
args:
- --web.listen-address=0.0.0.0:9090
- --config.file=/etc/prometheus/prometheus.yaml
- --storage.tsdb.path=/var/lib/prometheus
ports:
- name: web
containerPort: 9090
volumeMounts:
- name: config
mountPath: /etc/prometheus
- name: rules
mountPath: /etc/prometheus/rules
- name: data
mountPath: /var/lib/prometheus
readinessProbe:
httpGet:
path: /-/ready
port: 9090
initialDelaySeconds: 10
timeoutSeconds: 10
livenessProbe:
httpGet:
path: /-/healthy
port: 9090
initialDelaySeconds: 10
timeoutSeconds: 10
terminationGracePeriodSeconds: 30
volumes:
- name: config
configMap:
name: prometheus-config
- name: rules
configMap:
name: prometheus-rules
- name: data
emptyDir: {}
- name: config
configMap:
name: prometheus-config
- name: rules
configMap:
name: prometheus-rules
- name: data
emptyDir: {}

View File

@ -18,11 +18,13 @@ spec:
labels:
name: kube-state-metrics
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
serviceAccountName: kube-state-metrics
containers:
- name: kube-state-metrics
image: quay.io/coreos/kube-state-metrics:v1.3.1
image: quay.io/coreos/kube-state-metrics:v1.4.0
ports:
- name: metrics
containerPort: 8080

View File

@ -17,6 +17,8 @@ spec:
labels:
name: node-exporter
phase: prod
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
serviceAccountName: node-exporter
securityContext:

View File

@ -299,7 +299,7 @@ data:
labels:
severity: warning
annotations:
description: Pod {{$labels.namespaces}}/{{$labels.pod}} is was restarted {{$value}}
description: Pod {{$labels.namespaces}}/{{$labels.pod}} restarted {{$value}}
times within the last hour
summary: Pod is restarting frequently
kubelet.rules.yaml: |
@ -496,6 +496,13 @@ data:
annotations:
description: device {{$labels.device}} on node {{$labels.instance}} is running
full within the next 2 hours (mounted at {{$labels.mountpoint}})
- alert: InactiveRAIDDisk
expr: node_md_disks - node_md_disks_active > 0
for: 10m
labels:
severity: warning
annotations:
description: '{{$value}} RAID disk(s) on node {{$labels.instance}} are inactive'
prometheus.rules.yaml: |
groups:
- name: prometheus.rules

View File

@ -11,10 +11,10 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.10.3 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* Kubernetes v1.13.2 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/cl/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
## Docs

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=3fa3c2d73b57b2372c7c68e7db1cf82932ea1380"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=bcbdddd8d07c99ab88b2e9ebfb662de4c104de0a"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
@ -11,4 +11,5 @@ module "bootkube" {
pod_cidr = "${var.pod_cidr}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
enable_reporting = "${var.enable_reporting}"
}

View File

@ -7,7 +7,7 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.5"
Environment="ETCD_IMAGE_TAG=v3.3.10"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
@ -74,12 +74,11 @@ systemd:
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
--allow-privileged \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${k8s_dns_service_ip} \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
@ -89,6 +88,7 @@ systemd:
--node-labels=node-role.kubernetes.io/master \
--node-labels=node-role.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
@ -123,7 +123,7 @@ storage:
contents:
inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.3
KUBELET_IMAGE_TAG=v1.13.2
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -143,17 +143,14 @@ storage:
set -e
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.12.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \
--volume assets,kind=host,source=$${BOOTKUBE_ASSETS} \
--volume assets,kind=host,source=/opt/bootkube/assets \
--mount volume=assets,target=/assets \
--volume bootstrap,kind=host,source=/etc/kubernetes \
--mount volume=bootstrap,target=/etc/kubernetes \
$${RKT_OPTS} \
$${BOOTKUBE_ACI}:$${BOOTKUBE_VERSION} \
quay.io/coreos/bootkube:v0.14.0 \
--net=host \
--dns=host \
--exec=/bootkube -- start --asset-dir=/assets "$@"

View File

@ -24,12 +24,13 @@ resource "aws_instance" "controllers" {
instance_type = "${var.controller_type}"
ami = "${local.ami_id}"
user_data = "${element(data.ct_config.controller_ign.*.rendered, count.index)}"
user_data = "${element(data.ct_config.controller-ignitions.*.rendered, count.index)}"
# storage
root_block_device {
volume_type = "${var.disk_type}"
volume_size = "${var.disk_size}"
iops = "${var.disk_iops}"
}
# network
@ -38,12 +39,23 @@ resource "aws_instance" "controllers" {
vpc_security_group_ids = ["${aws_security_group.controller.id}"]
lifecycle {
ignore_changes = ["ami"]
ignore_changes = [
"ami",
"user_data",
]
}
}
# Controller Container Linux Config
data "template_file" "controller_config" {
# Controller Ignition configs
data "ct_config" "controller-ignitions" {
count = "${var.controller_count}"
content = "${element(data.template_file.controller-configs.*.rendered, count.index)}"
pretty_print = false
snippets = ["${var.controller_clc_snippets}"]
}
# Controller Container Linux configs
data "template_file" "controller-configs" {
count = "${var.controller_count}"
template = "${file("${path.module}/cl/controller.yaml.tmpl")}"
@ -54,29 +66,22 @@ data "template_file" "controller_config" {
etcd_domain = "${var.cluster_name}-etcd${count.index}.${var.dns_zone}"
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
etcd_initial_cluster = "${join(",", formatlist("%s=https://%s:2380", null_resource.repeat.*.triggers.name, null_resource.repeat.*.triggers.domain))}"
etcd_initial_cluster = "${join(",", data.template_file.etcds.*.rendered)}"
kubeconfig = "${indent(10, module.bootkube.kubeconfig)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
kubeconfig = "${indent(10, module.bootkube.kubeconfig-kubelet)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
cluster_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}
}
# Horrible hack to generate a Terraform list of a desired length without dependencies.
# Ideal ${repeat("etcd", 3) -> ["etcd", "etcd", "etcd"]}
resource null_resource "repeat" {
count = "${var.controller_count}"
data "template_file" "etcds" {
count = "${var.controller_count}"
template = "etcd$${index}=https://$${cluster_name}-etcd$${index}.$${dns_zone}:2380"
triggers {
name = "etcd${count.index}"
domain = "${var.cluster_name}-etcd${count.index}.${var.dns_zone}"
vars {
index = "${count.index}"
cluster_name = "${var.cluster_name}"
dns_zone = "${var.dns_zone}"
}
}
data "ct_config" "controller_ign" {
count = "${var.controller_count}"
content = "${element(data.template_file.controller_config.*.rendered, count.index)}"
pretty_print = false
snippets = ["${var.controller_clc_snippets}"]
}

View File

@ -5,17 +5,17 @@ resource "aws_route53_record" "apiserver" {
name = "${format("%s.%s.", var.cluster_name, var.dns_zone)}"
type = "A"
# AWS recommends their special "alias" records for ELBs
# AWS recommends their special "alias" records for NLBs
alias {
name = "${aws_lb.apiserver.dns_name}"
zone_id = "${aws_lb.apiserver.zone_id}"
name = "${aws_lb.nlb.dns_name}"
zone_id = "${aws_lb.nlb.zone_id}"
evaluate_target_health = true
}
}
# Network Load Balancer for apiservers
resource "aws_lb" "apiserver" {
name = "${var.cluster_name}-apiserver"
# Network Load Balancer for apiservers and ingress
resource "aws_lb" "nlb" {
name = "${var.cluster_name}-nlb"
load_balancer_type = "network"
internal = false
@ -24,11 +24,11 @@ resource "aws_lb" "apiserver" {
enable_cross_zone_load_balancing = true
}
# Forward TCP traffic to controllers
# Forward TCP apiserver traffic to controllers
resource "aws_lb_listener" "apiserver-https" {
load_balancer_arn = "${aws_lb.apiserver.arn}"
load_balancer_arn = "${aws_lb.nlb.arn}"
protocol = "TCP"
port = "443"
port = "6443"
default_action {
type = "forward"
@ -36,6 +36,30 @@ resource "aws_lb_listener" "apiserver-https" {
}
}
# Forward HTTP ingress traffic to workers
resource "aws_lb_listener" "ingress-http" {
load_balancer_arn = "${aws_lb.nlb.arn}"
protocol = "TCP"
port = 80
default_action {
type = "forward"
target_group_arn = "${module.workers.target_group_http}"
}
}
# Forward HTTPS ingress traffic to workers
resource "aws_lb_listener" "ingress-https" {
load_balancer_arn = "${aws_lb.nlb.arn}"
protocol = "TCP"
port = 443
default_action {
type = "forward"
target_group_arn = "${module.workers.target_group_https}"
}
}
# Target group of controllers
resource "aws_lb_target_group" "controllers" {
name = "${var.cluster_name}-controllers"
@ -43,12 +67,12 @@ resource "aws_lb_target_group" "controllers" {
target_type = "instance"
protocol = "TCP"
port = 443
port = 6443
# TCP health check for apiserver
health_check {
protocol = "TCP"
port = 443
port = 6443
# NLBs required to use same healthy and unhealthy thresholds
healthy_threshold = 3
@ -65,5 +89,5 @@ resource "aws_lb_target_group_attachment" "controllers" {
target_group_arn = "${aws_lb_target_group.controllers.arn}"
target_id = "${element(aws_instance.controllers.*.id, count.index)}"
port = 443
port = 6443
}

View File

@ -1,5 +1,11 @@
output "kubeconfig-admin" {
value = "${module.bootkube.kubeconfig-admin}"
}
# Outputs for Kubernetes Ingress
output "ingress_dns_name" {
value = "${module.workers.ingress_dns_name}"
value = "${aws_lb.nlb.dns_name}"
description = "DNS name of the network load balancer for distributing traffic to Ingress controllers"
}
@ -21,5 +27,17 @@ output "worker_security_groups" {
}
output "kubeconfig" {
value = "${module.bootkube.kubeconfig}"
value = "${module.bootkube.kubeconfig-kubelet}"
}
# Outputs for custom load balancing
output "worker_target_group_http" {
description = "ARN of a target group of workers for HTTP traffic"
value = "${module.workers.target_group_http}"
}
output "worker_target_group_https" {
description = "ARN of a target group of workers for HTTPS traffic"
value = "${module.workers.target_group_https}"
}

View File

@ -11,16 +11,6 @@ resource "aws_security_group" "controller" {
tags = "${map("Name", "${var.cluster_name}-controller")}"
}
resource "aws_security_group_rule" "controller-icmp" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "icmp"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-ssh" {
security_group_id = "${aws_security_group.controller.id}"
@ -31,16 +21,6 @@ resource "aws_security_group_rule" "controller-ssh" {
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-apiserver" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 443
to_port = 443
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-etcd" {
security_group_id = "${aws_security_group.controller.id}"
@ -51,6 +31,7 @@ resource "aws_security_group_rule" "controller-etcd" {
self = true
}
# Allow Prometheus to scrape etcd metrics
resource "aws_security_group_rule" "controller-etcd-metrics" {
security_group_id = "${aws_security_group.controller.id}"
@ -61,6 +42,16 @@ resource "aws_security_group_rule" "controller-etcd-metrics" {
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-apiserver" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 6443
to_port = 6443
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-flannel" {
security_group_id = "${aws_security_group.controller.id}"
@ -81,6 +72,7 @@ resource "aws_security_group_rule" "controller-flannel-self" {
self = true
}
# Allow Prometheus to scrape node-exporter daemonset
resource "aws_security_group_rule" "controller-node-exporter" {
security_group_id = "${aws_security_group.controller.id}"
@ -91,6 +83,7 @@ resource "aws_security_group_rule" "controller-node-exporter" {
source_security_group_id = "${aws_security_group.worker.id}"
}
# Allow apiserver to access kubelets for exec, log, port-forward
resource "aws_security_group_rule" "controller-kubelet" {
security_group_id = "${aws_security_group.controller.id}"
@ -111,26 +104,6 @@ resource "aws_security_group_rule" "controller-kubelet-self" {
self = true
}
resource "aws_security_group_rule" "controller-kubelet-read" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 10255
to_port = 10255
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-kubelet-read-self" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 10255
to_port = 10255
self = true
}
resource "aws_security_group_rule" "controller-bgp" {
security_group_id = "${aws_security_group.controller.id}"
@ -213,16 +186,6 @@ resource "aws_security_group" "worker" {
tags = "${map("Name", "${var.cluster_name}-worker")}"
}
resource "aws_security_group_rule" "worker-icmp" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "icmp"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "worker-ssh" {
security_group_id = "${aws_security_group.worker.id}"
@ -273,6 +236,7 @@ resource "aws_security_group_rule" "worker-flannel-self" {
self = true
}
# Allow Prometheus to scrape node-exporter daemonset
resource "aws_security_group_rule" "worker-node-exporter" {
security_group_id = "${aws_security_group.worker.id}"
@ -293,6 +257,7 @@ resource "aws_security_group_rule" "ingress-health" {
cidr_blocks = ["0.0.0.0/0"]
}
# Allow apiserver to access kubelets for exec, log, port-forward
resource "aws_security_group_rule" "worker-kubelet" {
security_group_id = "${aws_security_group.worker.id}"
@ -303,6 +268,7 @@ resource "aws_security_group_rule" "worker-kubelet" {
source_security_group_id = "${aws_security_group.controller.id}"
}
# Allow Prometheus to scrape kubelet metrics
resource "aws_security_group_rule" "worker-kubelet-self" {
security_group_id = "${aws_security_group.worker.id}"
@ -313,26 +279,6 @@ resource "aws_security_group_rule" "worker-kubelet-self" {
self = true
}
resource "aws_security_group_rule" "worker-kubelet-read" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 10255
to_port = 10255
source_security_group_id = "${aws_security_group.controller.id}"
}
resource "aws_security_group_rule" "worker-kubelet-read-self" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 10255
to_port = 10255
self = true
}
resource "aws_security_group_rule" "worker-bgp" {
security_group_id = "${aws_security_group.worker.id}"

View File

@ -31,13 +31,13 @@ variable "worker_count" {
variable "controller_type" {
type = "string"
default = "t2.small"
default = "t3.small"
description = "EC2 instance type for controllers"
}
variable "worker_type" {
type = "string"
default = "t2.small"
default = "t3.small"
description = "EC2 instance type for workers"
}
@ -59,6 +59,12 @@ variable "disk_type" {
description = "Type of the EBS volume (e.g. standard, gp2, io1)"
}
variable "disk_iops" {
type = "string"
default = "0"
description = "IOPS of the EBS volume (e.g. 100)"
}
variable "worker_price" {
type = "string"
default = ""
@ -116,7 +122,7 @@ variable "pod_cidr" {
variable "service_cidr" {
description = <<EOD
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD
type = "string"
@ -124,7 +130,13 @@ EOD
}
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by kube-dns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = "string"
default = "cluster.local"
}
variable "enable_reporting" {
type = "string"
description = "Enable usage or analytics reporting to upstreams (Calico)"
default = "false"
}

View File

@ -13,7 +13,7 @@ module "workers" {
spot_price = "${var.worker_price}"
# configuration
kubeconfig = "${module.bootkube.kubeconfig}"
kubeconfig = "${module.bootkube.kubeconfig-kubelet}"
ssh_authorized_key = "${var.ssh_authorized_key}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"

View File

@ -47,12 +47,11 @@ systemd:
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
--allow-privileged \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${k8s_dns_service_ip} \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
@ -61,6 +60,7 @@ systemd:
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/node \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
@ -93,7 +93,7 @@ storage:
contents:
inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.3
KUBELET_IMAGE_TAG=v1.13.2
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
@ -111,7 +111,7 @@ storage:
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://k8s.gcr.io/hyperkube:v1.10.3 \
docker://k8s.gcr.io/hyperkube:v1.13.2 \
--net=host \
--dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname)

View File

@ -1,39 +1,4 @@
# Network Load Balancer for Ingress
resource "aws_lb" "ingress" {
name = "${var.name}-ingress"
load_balancer_type = "network"
internal = false
subnets = ["${var.subnet_ids}"]
enable_cross_zone_load_balancing = true
}
# Forward HTTP traffic to workers
resource "aws_lb_listener" "ingress-http" {
load_balancer_arn = "${aws_lb.ingress.arn}"
protocol = "TCP"
port = 80
default_action {
type = "forward"
target_group_arn = "${aws_lb_target_group.workers-http.arn}"
}
}
# Forward HTTPS traffic to workers
resource "aws_lb_listener" "ingress-https" {
load_balancer_arn = "${aws_lb.ingress.arn}"
protocol = "TCP"
port = 443
default_action {
type = "forward"
target_group_arn = "${aws_lb_target_group.workers-https.arn}"
}
}
# Network Load Balancer target groups of instances
# Target groups of instances for use with load balancers
resource "aws_lb_target_group" "workers-http" {
name = "${var.name}-workers-http"

View File

@ -1,4 +1,9 @@
output "ingress_dns_name" {
value = "${aws_lb.ingress.dns_name}"
description = "DNS name of the network load balancer for distributing traffic to Ingress controllers"
output "target_group_http" {
description = "ARN of a target group of workers for HTTP traffic"
value = "${aws_lb_target_group.workers-http.arn}"
}
output "target_group_https" {
description = "ARN of a target group of workers for HTTPS traffic"
value = "${aws_lb_target_group.workers-https.arn}"
}

View File

@ -30,7 +30,7 @@ variable "count" {
variable "instance_type" {
type = "string"
default = "t2.small"
default = "t3.small"
description = "EC2 instance type"
}
@ -52,6 +52,12 @@ variable "disk_type" {
description = "Type of the EBS volume (e.g. standard, gp2, io1)"
}
variable "disk_iops" {
type = "string"
default = "0"
description = "IOPS of the EBS volume (required for io1)"
}
variable "spot_price" {
type = "string"
default = ""
@ -79,7 +85,7 @@ variable "ssh_authorized_key" {
variable "service_cidr" {
description = <<EOD
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD
type = "string"
@ -87,7 +93,7 @@ EOD
}
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by kube-dns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = "string"
default = "cluster.local"
}

View File

@ -41,16 +41,18 @@ resource "aws_autoscaling_group" "workers" {
# Worker template
resource "aws_launch_configuration" "worker" {
image_id = "${local.ami_id}"
instance_type = "${var.instance_type}"
spot_price = "${var.spot_price}"
image_id = "${local.ami_id}"
instance_type = "${var.instance_type}"
spot_price = "${var.spot_price}"
enable_monitoring = false
user_data = "${data.ct_config.worker_ign.rendered}"
user_data = "${data.ct_config.worker-ignition.rendered}"
# storage
root_block_device {
volume_type = "${var.disk_type}"
volume_size = "${var.disk_size}"
iops = "${var.disk_iops}"
}
# network
@ -63,20 +65,21 @@ resource "aws_launch_configuration" "worker" {
}
}
# Worker Container Linux Config
data "template_file" "worker_config" {
template = "${file("${path.module}/cl/worker.yaml.tmpl")}"
vars = {
kubeconfig = "${indent(10, var.kubeconfig)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}
}
data "ct_config" "worker_ign" {
content = "${data.template_file.worker_config.rendered}"
# Worker Ignition config
data "ct_config" "worker-ignition" {
content = "${data.template_file.worker-config.rendered}"
pretty_print = false
snippets = ["${var.clc_snippets}"]
}
# Worker Container Linux config
data "template_file" "worker-config" {
template = "${file("${path.module}/cl/worker.yaml.tmpl")}"
vars = {
kubeconfig = "${indent(10, var.kubeconfig)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
cluster_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}
}

View File

@ -11,10 +11,10 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.10.3 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* Kubernetes v1.13.2 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/)
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/) and [spot](https://typhoon.psdn.io/cl/aws/#spot) workers
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
## Docs

View File

@ -14,6 +14,6 @@ data "aws_ami" "fedora" {
filter {
name = "name"
values = ["Fedora-Atomic-27-20180419.0.x86_64-*-gp2-*"]
values = ["Fedora-AtomicHost-28-20180625.1.x86_64-*-gp2-*"]
}
}

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=3fa3c2d73b57b2372c7c68e7db1cf82932ea1380"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=bcbdddd8d07c99ab88b2e9ebfb662de4c104de0a"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
@ -11,6 +11,7 @@ module "bootkube" {
pod_cidr = "${var.pod_cidr}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
enable_reporting = "${var.enable_reporting}"
# Fedora
trusted_certs_dir = "/etc/pki/tls/certs"

View File

@ -19,24 +19,9 @@ write_files:
ETCD_PEER_CERT_FILE=/etc/ssl/certs/etcd/peer.crt
ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer.key
ETCD_PEER_CLIENT_CERT_AUTH=true
- path: /etc/systemd/system/cloud-metadata.service
content: |
[Unit]
Description=Cloud metadata agent
[Service]
Type=oneshot
Environment=OUTPUT=/run/metadata/cloud
ExecStart=/usr/bin/mkdir -p /run/metadata
ExecStart=/usr/bin/bash -c 'echo "HOSTNAME_OVERRIDE=$(curl\
--url http://169.254.169.254/latest/meta-data/local-ipv4\
--retry 10)" > $${OUTPUT}'
[Install]
WantedBy=multi-user.target
- path: /etc/systemd/system/kubelet.service.d/10-typhoon.conf
content: |
[Unit]
Requires=cloud-metadata.service
After=cloud-metadata.service
Wants=rpc-statd.service
[Service]
ExecStartPre=/bin/mkdir -p /opt/cni/bin
@ -51,12 +36,11 @@ write_files:
RestartSec=10
- path: /etc/kubernetes/kubelet.conf
content: |
ARGS="--allow-privileged \
--anonymous-auth=false \
ARGS="--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${k8s_dns_service_ip} \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
@ -66,6 +50,7 @@ write_files:
--node-labels=node-role.kubernetes.io/master \
--node-labels=node-role.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins"
- path: /etc/kubernetes/kubeconfig
@ -93,11 +78,10 @@ bootcmd:
runcmd:
- [systemctl, daemon-reload]
- [systemctl, restart, NetworkManager]
- "atomic install --system --name=etcd quay.io/poseidon/etcd:v3.3.5"
- "atomic install --system --name=kubelet quay.io/poseidon/kubelet:v1.10.3"
- "atomic install --system --name=bootkube quay.io/poseidon/bootkube:v0.12.0"
- "atomic install --system --name=etcd quay.io/poseidon/etcd:v3.3.10"
- "atomic install --system --name=kubelet quay.io/poseidon/kubelet:v1.13.2"
- "atomic install --system --name=bootkube quay.io/poseidon/bootkube:v0.14.0"
- [systemctl, start, --no-block, etcd.service]
- [systemctl, enable, cloud-metadata.service]
- [systemctl, start, --no-block, kubelet.service]
users:
- default

View File

@ -30,6 +30,7 @@ resource "aws_instance" "controllers" {
root_block_device {
volume_type = "${var.disk_type}"
volume_size = "${var.disk_size}"
iops = "${var.disk_iops}"
}
# network
@ -38,7 +39,10 @@ resource "aws_instance" "controllers" {
vpc_security_group_ids = ["${aws_security_group.controller.id}"]
lifecycle {
ignore_changes = ["ami"]
ignore_changes = [
"ami",
"user_data",
]
}
}
@ -54,22 +58,22 @@ data "template_file" "controller-cloudinit" {
etcd_domain = "${var.cluster_name}-etcd${count.index}.${var.dns_zone}"
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
etcd_initial_cluster = "${join(",", formatlist("%s=https://%s:2380", null_resource.repeat.*.triggers.name, null_resource.repeat.*.triggers.domain))}"
etcd_initial_cluster = "${join(",", data.template_file.etcds.*.rendered)}"
kubeconfig = "${indent(6, module.bootkube.kubeconfig)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
kubeconfig = "${indent(6, module.bootkube.kubeconfig-kubelet)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
cluster_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}
}
# Horrible hack to generate a Terraform list of a desired length without dependencies.
# Ideal ${repeat("etcd", 3) -> ["etcd", "etcd", "etcd"]}
resource null_resource "repeat" {
count = "${var.controller_count}"
data "template_file" "etcds" {
count = "${var.controller_count}"
template = "etcd$${index}=https://$${cluster_name}-etcd$${index}.$${dns_zone}:2380"
triggers {
name = "etcd${count.index}"
domain = "${var.cluster_name}-etcd${count.index}.${var.dns_zone}"
vars {
index = "${count.index}"
cluster_name = "${var.cluster_name}"
dns_zone = "${var.dns_zone}"
}
}

View File

@ -5,17 +5,17 @@ resource "aws_route53_record" "apiserver" {
name = "${format("%s.%s.", var.cluster_name, var.dns_zone)}"
type = "A"
# AWS recommends their special "alias" records for ELBs
# AWS recommends their special "alias" records for NLBs
alias {
name = "${aws_lb.apiserver.dns_name}"
zone_id = "${aws_lb.apiserver.zone_id}"
name = "${aws_lb.nlb.dns_name}"
zone_id = "${aws_lb.nlb.zone_id}"
evaluate_target_health = true
}
}
# Network Load Balancer for apiservers
resource "aws_lb" "apiserver" {
name = "${var.cluster_name}-apiserver"
# Network Load Balancer for apiservers and ingress
resource "aws_lb" "nlb" {
name = "${var.cluster_name}-nlb"
load_balancer_type = "network"
internal = false
@ -24,11 +24,11 @@ resource "aws_lb" "apiserver" {
enable_cross_zone_load_balancing = true
}
# Forward TCP traffic to controllers
# Forward TCP apiserver traffic to controllers
resource "aws_lb_listener" "apiserver-https" {
load_balancer_arn = "${aws_lb.apiserver.arn}"
load_balancer_arn = "${aws_lb.nlb.arn}"
protocol = "TCP"
port = "443"
port = "6443"
default_action {
type = "forward"
@ -36,6 +36,30 @@ resource "aws_lb_listener" "apiserver-https" {
}
}
# Forward HTTP ingress traffic to workers
resource "aws_lb_listener" "ingress-http" {
load_balancer_arn = "${aws_lb.nlb.arn}"
protocol = "TCP"
port = 80
default_action {
type = "forward"
target_group_arn = "${module.workers.target_group_http}"
}
}
# Forward HTTPS ingress traffic to workers
resource "aws_lb_listener" "ingress-https" {
load_balancer_arn = "${aws_lb.nlb.arn}"
protocol = "TCP"
port = 443
default_action {
type = "forward"
target_group_arn = "${module.workers.target_group_https}"
}
}
# Target group of controllers
resource "aws_lb_target_group" "controllers" {
name = "${var.cluster_name}-controllers"
@ -43,12 +67,12 @@ resource "aws_lb_target_group" "controllers" {
target_type = "instance"
protocol = "TCP"
port = 443
port = 6443
# TCP health check for apiserver
health_check {
protocol = "TCP"
port = 443
port = 6443
# NLBs required to use same healthy and unhealthy thresholds
healthy_threshold = 3
@ -65,5 +89,5 @@ resource "aws_lb_target_group_attachment" "controllers" {
target_group_arn = "${aws_lb_target_group.controllers.arn}"
target_id = "${element(aws_instance.controllers.*.id, count.index)}"
port = 443
port = 6443
}

View File

@ -1,5 +1,11 @@
output "kubeconfig-admin" {
value = "${module.bootkube.kubeconfig-admin}"
}
# Outputs for Kubernetes Ingress
output "ingress_dns_name" {
value = "${module.workers.ingress_dns_name}"
value = "${aws_lb.nlb.dns_name}"
description = "DNS name of the network load balancer for distributing traffic to Ingress controllers"
}
@ -21,5 +27,17 @@ output "worker_security_groups" {
}
output "kubeconfig" {
value = "${module.bootkube.kubeconfig}"
value = "${module.bootkube.kubeconfig-kubelet}"
}
# Outputs for custom load balancing
output "worker_target_group_http" {
description = "ARN of a target group of workers for HTTP traffic"
value = "${module.workers.target_group_http}"
}
output "worker_target_group_https" {
description = "ARN of a target group of workers for HTTPS traffic"
value = "${module.workers.target_group_https}"
}

View File

@ -11,16 +11,6 @@ resource "aws_security_group" "controller" {
tags = "${map("Name", "${var.cluster_name}-controller")}"
}
resource "aws_security_group_rule" "controller-icmp" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "icmp"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-ssh" {
security_group_id = "${aws_security_group.controller.id}"
@ -31,16 +21,6 @@ resource "aws_security_group_rule" "controller-ssh" {
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-apiserver" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 443
to_port = 443
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-etcd" {
security_group_id = "${aws_security_group.controller.id}"
@ -51,6 +31,7 @@ resource "aws_security_group_rule" "controller-etcd" {
self = true
}
# Allow Prometheus to scrape etcd metrics
resource "aws_security_group_rule" "controller-etcd-metrics" {
security_group_id = "${aws_security_group.controller.id}"
@ -61,6 +42,16 @@ resource "aws_security_group_rule" "controller-etcd-metrics" {
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-apiserver" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 6443
to_port = 6443
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "controller-flannel" {
security_group_id = "${aws_security_group.controller.id}"
@ -81,6 +72,7 @@ resource "aws_security_group_rule" "controller-flannel-self" {
self = true
}
# Allow Prometheus to scrape node-exporter daemonset
resource "aws_security_group_rule" "controller-node-exporter" {
security_group_id = "${aws_security_group.controller.id}"
@ -91,6 +83,7 @@ resource "aws_security_group_rule" "controller-node-exporter" {
source_security_group_id = "${aws_security_group.worker.id}"
}
# Allow apiserver to access kubelets for exec, log, port-forward
resource "aws_security_group_rule" "controller-kubelet" {
security_group_id = "${aws_security_group.controller.id}"
@ -111,26 +104,6 @@ resource "aws_security_group_rule" "controller-kubelet-self" {
self = true
}
resource "aws_security_group_rule" "controller-kubelet-read" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 10255
to_port = 10255
source_security_group_id = "${aws_security_group.worker.id}"
}
resource "aws_security_group_rule" "controller-kubelet-read-self" {
security_group_id = "${aws_security_group.controller.id}"
type = "ingress"
protocol = "tcp"
from_port = 10255
to_port = 10255
self = true
}
resource "aws_security_group_rule" "controller-bgp" {
security_group_id = "${aws_security_group.controller.id}"
@ -213,16 +186,6 @@ resource "aws_security_group" "worker" {
tags = "${map("Name", "${var.cluster_name}-worker")}"
}
resource "aws_security_group_rule" "worker-icmp" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "icmp"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "worker-ssh" {
security_group_id = "${aws_security_group.worker.id}"
@ -273,6 +236,7 @@ resource "aws_security_group_rule" "worker-flannel-self" {
self = true
}
# Allow Prometheus to scrape node-exporter daemonset
resource "aws_security_group_rule" "worker-node-exporter" {
security_group_id = "${aws_security_group.worker.id}"
@ -293,6 +257,7 @@ resource "aws_security_group_rule" "ingress-health" {
cidr_blocks = ["0.0.0.0/0"]
}
# Allow apiserver to access kubelets for exec, log, port-forward
resource "aws_security_group_rule" "worker-kubelet" {
security_group_id = "${aws_security_group.worker.id}"
@ -303,6 +268,7 @@ resource "aws_security_group_rule" "worker-kubelet" {
source_security_group_id = "${aws_security_group.controller.id}"
}
# Allow Prometheus to scrape kubelet metrics
resource "aws_security_group_rule" "worker-kubelet-self" {
security_group_id = "${aws_security_group.worker.id}"
@ -313,26 +279,6 @@ resource "aws_security_group_rule" "worker-kubelet-self" {
self = true
}
resource "aws_security_group_rule" "worker-kubelet-read" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 10255
to_port = 10255
source_security_group_id = "${aws_security_group.controller.id}"
}
resource "aws_security_group_rule" "worker-kubelet-read-self" {
security_group_id = "${aws_security_group.worker.id}"
type = "ingress"
protocol = "tcp"
from_port = 10255
to_port = 10255
self = true
}
resource "aws_security_group_rule" "worker-bgp" {
security_group_id = "${aws_security_group.worker.id}"

View File

@ -31,13 +31,13 @@ variable "worker_count" {
variable "controller_type" {
type = "string"
default = "t2.small"
default = "t3.small"
description = "EC2 instance type for controllers"
}
variable "worker_type" {
type = "string"
default = "t2.small"
default = "t3.small"
description = "EC2 instance type for workers"
}
@ -53,6 +53,12 @@ variable "disk_type" {
description = "Type of the EBS volume (e.g. standard, gp2, io1)"
}
variable "disk_iops" {
type = "string"
default = "0"
description = "IOPS of the EBS volume (e.g. 100)"
}
variable "worker_price" {
type = "string"
default = ""
@ -98,7 +104,7 @@ variable "pod_cidr" {
variable "service_cidr" {
description = <<EOD
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD
type = "string"
@ -106,7 +112,13 @@ EOD
}
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by kube-dns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = "string"
default = "cluster.local"
}
variable "enable_reporting" {
type = "string"
description = "Enable usage or analytics reporting to upstreams (Calico)"
default = "false"
}
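
The variables above are plain module inputs; disk_iops only takes effect with an IOPS-provisioned volume type. A minimal sketch of overriding these defaults from a cluster definition, assuming a hypothetical module block (the source path, values, and cluster name are illustrative, and other required inputs are omitted):

module "aws-tempest" {
  source = "git::https://github.com/poseidon/typhoon//aws/fedora-atomic/kubernetes?ref=..."  # hypothetical path/ref

  # storage: disk_iops is only meaningful for io1 volumes
  disk_type = "io1"
  disk_iops = "400"

  # opt out of Calico usage reporting via the new enable_reporting input
  enable_reporting = "false"

  # ...cluster_name, dns_zone, ssh_authorized_key, asset_dir, etc. omitted
}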

View File

@ -12,7 +12,7 @@ module "workers" {
spot_price = "${var.worker_price}"
# configuration
kubeconfig = "${module.bootkube.kubeconfig}"
kubeconfig = "${module.bootkube.kubeconfig-kubelet}"
ssh_authorized_key = "${var.ssh_authorized_key}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"

View File

@ -14,6 +14,6 @@ data "aws_ami" "fedora" {
filter {
name = "name"
values = ["Fedora-Atomic-27-20180419.0.x86_64-*-gp2-*"]
values = ["Fedora-AtomicHost-28-20180625.1.x86_64-*-gp2-*"]
}
}

View File

@ -1,23 +1,8 @@
#cloud-config
write_files:
- path: /etc/systemd/system/cloud-metadata.service
content: |
[Unit]
Description=Cloud metadata agent
[Service]
Type=oneshot
Environment=OUTPUT=/run/metadata/cloud
ExecStart=/usr/bin/mkdir -p /run/metadata
ExecStart=/usr/bin/bash -c 'echo "HOSTNAME_OVERRIDE=$(curl\
--url http://169.254.169.254/latest/meta-data/local-ipv4\
--retry 10)" > $${OUTPUT}'
[Install]
WantedBy=multi-user.target
- path: /etc/systemd/system/kubelet.service.d/10-typhoon.conf
content: |
[Unit]
Requires=cloud-metadata.service
After=cloud-metadata.service
Wants=rpc-statd.service
[Service]
ExecStartPre=/bin/mkdir -p /opt/cni/bin
@ -30,12 +15,11 @@ write_files:
RestartSec=10
- path: /etc/kubernetes/kubelet.conf
content: |
ARGS="--allow-privileged \
--anonymous-auth=false \
ARGS="--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${k8s_dns_service_ip} \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
@ -44,6 +28,7 @@ write_files:
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/node \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins"
- path: /etc/kubernetes/kubeconfig
permissions: '0644'
@ -69,8 +54,7 @@ bootcmd:
runcmd:
- [systemctl, daemon-reload]
- [systemctl, restart, NetworkManager]
- [systemctl, enable, cloud-metadata.service]
- "atomic install --system --name=kubelet quay.io/poseidon/kubelet:v1.10.3"
- "atomic install --system --name=kubelet quay.io/poseidon/kubelet:v1.13.2"
- [systemctl, start, --no-block, kubelet.service]
users:
- default

View File

@ -1,39 +1,4 @@
# Network Load Balancer for Ingress
resource "aws_lb" "ingress" {
name = "${var.name}-ingress"
load_balancer_type = "network"
internal = false
subnets = ["${var.subnet_ids}"]
enable_cross_zone_load_balancing = true
}
# Forward HTTP traffic to workers
resource "aws_lb_listener" "ingress-http" {
load_balancer_arn = "${aws_lb.ingress.arn}"
protocol = "TCP"
port = 80
default_action {
type = "forward"
target_group_arn = "${aws_lb_target_group.workers-http.arn}"
}
}
# Forward HTTPS traffic to workers
resource "aws_lb_listener" "ingress-https" {
load_balancer_arn = "${aws_lb.ingress.arn}"
protocol = "TCP"
port = 443
default_action {
type = "forward"
target_group_arn = "${aws_lb_target_group.workers-https.arn}"
}
}
# Network Load Balancer target groups of instances
# Target groups of instances for use with load balancers
resource "aws_lb_target_group" "workers-http" {
name = "${var.name}-workers-http"

View File

@ -1,4 +1,9 @@
output "ingress_dns_name" {
value = "${aws_lb.ingress.dns_name}"
description = "DNS name of the network load balancer for distributing traffic to Ingress controllers"
output "target_group_http" {
description = "ARN of a target group of workers for HTTP traffic"
value = "${aws_lb_target_group.workers-http.arn}"
}
output "target_group_https" {
description = "ARN of a target group of workers for HTTPS traffic"
value = "${aws_lb_target_group.workers-https.arn}"
}
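
With the ingress network load balancer and its listeners removed from this module (the deleted aws_lb and aws_lb_listener resources above), these target group ARNs are what an operator would attach to a load balancer managed outside the module. A minimal sketch mirroring the removed listener; the NLB resource and the module reference are hypothetical placeholders, not part of this diff:

resource "aws_lb_listener" "custom-ingress-http" {
  load_balancer_arn = "${aws_lb.custom-ingress.arn}"   # hypothetical, externally managed NLB
  protocol          = "TCP"
  port              = 80

  default_action {
    type             = "forward"
    target_group_arn = "${module.tempest.target_group_http}"   # wherever this output is surfaced
  }
}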

View File

@ -30,7 +30,7 @@ variable "count" {
variable "instance_type" {
type = "string"
default = "t2.small"
default = "t3.small"
description = "EC2 instance type"
}
@ -46,6 +46,12 @@ variable "disk_type" {
description = "Type of the EBS volume (e.g. standard, gp2, io1)"
}
variable "disk_iops" {
type = "string"
default = "0"
description = "IOPS of the EBS volume (required for io1)"
}
variable "spot_price" {
type = "string"
default = ""
@ -67,7 +73,7 @@ variable "ssh_authorized_key" {
variable "service_cidr" {
description = <<EOD
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD
type = "string"
@ -75,7 +81,7 @@ EOD
}
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by kube-dns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = "string"
default = "cluster.local"
}

View File

@ -41,9 +41,10 @@ resource "aws_autoscaling_group" "workers" {
# Worker template
resource "aws_launch_configuration" "worker" {
image_id = "${data.aws_ami.fedora.image_id}"
instance_type = "${var.instance_type}"
spot_price = "${var.spot_price}"
image_id = "${data.aws_ami.fedora.image_id}"
instance_type = "${var.instance_type}"
spot_price = "${var.spot_price}"
enable_monitoring = false
user_data = "${data.template_file.worker-cloudinit.rendered}"
@ -51,6 +52,7 @@ resource "aws_launch_configuration" "worker" {
root_block_device {
volume_type = "${var.disk_type}"
volume_size = "${var.disk_size}"
iops = "${var.disk_iops}"
}
# network
@ -68,9 +70,9 @@ data "template_file" "worker-cloudinit" {
template = "${file("${path.module}/cloudinit/worker.yaml.tmpl")}"
vars = {
kubeconfig = "${indent(6, var.kubeconfig)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
k8s_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
kubeconfig = "${indent(6, var.kubeconfig)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
cluster_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}
}
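
The renamed cluster_dns_service_ip value is computed from service_cidr with Terraform's cidrhost() function, which matches the reservation called out in the service_cidr description (1st IP for the apiserver, 10th for CoreDNS). As a worked example with a service_cidr of 10.3.0.0/16:

# cidrhost(prefix, hostnum) returns the hostnum-th address within prefix
cidrhost("10.3.0.0/16", 1)    # 10.3.0.1  - reserved for kube-apiserver
cidrhost("10.3.0.0/16", 10)   # 10.3.0.10 - cluster_dns_service_ip passed to the kubelet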

View File

@ -0,0 +1,23 @@
The MIT License (MIT)
Copyright (c) 2017 Typhoon Authors
Copyright (c) 2017 Dalton Hubble
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

View File

@ -0,0 +1,23 @@
# Typhoon <img align="right" src="https://storage.googleapis.com/poseidon/typhoon-logo.png">
Typhoon is a minimal and free Kubernetes distribution.
* Minimal, stable base Kubernetes distribution
* Declarative infrastructure and configuration
* Free (freedom and cost) and privacy-respecting
* Practical for labs, datacenters, and clouds
Typhoon distributes upstream Kubernetes, architectural conventions, and cluster addons, much like a GNU/Linux distribution provides the Linux kernel and userspace components.
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.13.2 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled
* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [low-priority](https://typhoon.psdn.io/cl/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
## Docs
Please see the [official docs](https://typhoon.psdn.io) and the Azure [tutorial](https://typhoon.psdn.io/cl/azure/).

View File

@ -0,0 +1,14 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=bcbdddd8d07c99ab88b2e9ebfb662de4c104de0a"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
etcd_servers = ["${formatlist("%s.%s", azurerm_dns_a_record.etcds.*.name, var.dns_zone)}"]
asset_dir = "${var.asset_dir}"
networking = "flannel"
pod_cidr = "${var.pod_cidr}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
enable_reporting = "${var.enable_reporting}"
}

View File

@ -0,0 +1,161 @@
---
systemd:
units:
- name: etcd-member.service
enable: true
dropins:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.10"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${etcd_domain}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${etcd_domain}:2380"
Environment="ETCD_LISTEN_CLIENT_URLS=https://0.0.0.0:2379"
Environment="ETCD_LISTEN_PEER_URLS=https://0.0.0.0:2380"
Environment="ETCD_LISTEN_METRICS_URLS=http://0.0.0.0:2381"
Environment="ETCD_INITIAL_CLUSTER=${etcd_initial_cluster}"
Environment="ETCD_STRICT_RECONFIG_CHECK=true"
Environment="ETCD_SSL_DIR=/etc/ssl/etcd"
Environment="ETCD_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/server-ca.crt"
Environment="ETCD_CERT_FILE=/etc/ssl/certs/etcd/server.crt"
Environment="ETCD_KEY_FILE=/etc/ssl/certs/etcd/server.key"
Environment="ETCD_CLIENT_CERT_AUTH=true"
Environment="ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/peer-ca.crt"
Environment="ETCD_PEER_CERT_FILE=/etc/ssl/certs/etcd/peer.crt"
Environment="ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer.key"
Environment="ETCD_PEER_CLIENT_CERT_AUTH=true"
- name: docker.service
enable: true
- name: locksmithd.service
mask: true
- name: wait-for-dns.service
enable: true
contents: |
[Unit]
Description=Wait for DNS entries
Wants=systemd-resolved.service
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
RequiredBy=etcd-member.service
- name: kubelet.service
enable: true
contents: |
[Unit]
Description=Kubelet via Hyperkube
Wants=rpc-statd.service
[Service]
EnvironmentFile=/etc/kubernetes/kubelet.env
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/cache/kubelet-pod.uuid \
--volume=resolv,kind=host,source=/etc/resolv.conf \
--mount volume=resolv,target=/etc/resolv.conf \
--volume var-lib-cni,kind=host,source=/var/lib/cni \
--mount volume=var-lib-cni,target=/var/lib/cni \
--volume var-lib-calico,kind=host,source=/var/lib/calico \
--mount volume=var-lib-calico,target=/var/lib/calico \
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=opt-cni-bin,target=/opt/cni/bin \
--volume var-log,kind=host,source=/var/log \
--mount volume=var-log,target=/var/log \
--insecure-options=image"
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /etc/kubernetes/checkpoint-secrets
ExecStartPre=/bin/mkdir -p /etc/kubernetes/inactive-manifests
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/calico
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/master \
--node-labels=node-role.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
- name: bootkube.service
contents: |
[Unit]
Description=Bootstrap a Kubernetes cluster
ConditionPathExists=!/opt/bootkube/init_bootkube.done
[Service]
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/opt/bootkube
ExecStart=/opt/bootkube/bootkube-start
ExecStartPost=/bin/touch /opt/bootkube/init_bootkube.done
[Install]
WantedBy=multi-user.target
storage:
files:
- path: /etc/kubernetes/kubeconfig
filesystem: root
mode: 0644
contents:
inline: |
${kubeconfig}
- path: /etc/kubernetes/kubelet.env
filesystem: root
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.13.2
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
inline: |
fs.inotify.max_user_watches=16184
- path: /opt/bootkube/bootkube-start
filesystem: root
mode: 0544
user:
id: 500
group:
id: 500
contents:
inline: |
#!/bin/bash
# Wrapper for bootkube start
set -e
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
exec /usr/bin/rkt run \
--trust-keys-from-https \
--volume assets,kind=host,source=/opt/bootkube/assets \
--mount volume=assets,target=/assets \
--volume bootstrap,kind=host,source=/etc/kubernetes \
--mount volume=bootstrap,target=/etc/kubernetes \
$${RKT_OPTS} \
quay.io/coreos/bootkube:v0.14.0 \
--net=host \
--dns=host \
--exec=/bootkube -- start --asset-dir=/assets "$@"
passwd:
users:
- name: core
ssh_authorized_keys:
- "${ssh_authorized_key}"

View File

@ -0,0 +1,168 @@
# Discrete DNS records for each controller's private IPv4 for etcd usage
resource "azurerm_dns_a_record" "etcds" {
count = "${var.controller_count}"
resource_group_name = "${var.dns_zone_group}"
# DNS Zone name where record should be created
zone_name = "${var.dns_zone}"
# DNS record
name = "${format("%s-etcd%d", var.cluster_name, count.index)}"
ttl = 300
# private IPv4 address for etcd
records = ["${element(azurerm_network_interface.controllers.*.private_ip_address, count.index)}"]
}
locals {
# Channel for a Container Linux derivative
# coreos-stable -> Container Linux Stable
channel = "${element(split("-", var.os_image), 1)}"
}
# Controller availability set to spread controllers
resource "azurerm_availability_set" "controllers" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}-controllers"
location = "${var.region}"
platform_fault_domain_count = 2
platform_update_domain_count = 4
managed = true
}
# Controller instances
resource "azurerm_virtual_machine" "controllers" {
count = "${var.controller_count}"
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}-controller-${count.index}"
location = "${var.region}"
availability_set_id = "${azurerm_availability_set.controllers.id}"
vm_size = "${var.controller_type}"
# boot
storage_image_reference {
publisher = "CoreOS"
offer = "CoreOS"
sku = "${local.channel}"
version = "latest"
}
# storage
storage_os_disk {
name = "${var.cluster_name}-controller-${count.index}"
create_option = "FromImage"
caching = "ReadWrite"
disk_size_gb = "${var.disk_size}"
os_type = "Linux"
managed_disk_type = "Premium_LRS"
}
# network
network_interface_ids = ["${element(azurerm_network_interface.controllers.*.id, count.index)}"]
os_profile {
computer_name = "${var.cluster_name}-controller-${count.index}"
admin_username = "core"
custom_data = "${element(data.ct_config.controller-ignitions.*.rendered, count.index)}"
}
# Azure mandates setting an ssh_key, even though Ignition custom_data handles it too
os_profile_linux_config {
disable_password_authentication = true
ssh_keys {
path = "/home/core/.ssh/authorized_keys"
key_data = "${var.ssh_authorized_key}"
}
}
# lifecycle
delete_os_disk_on_termination = true
delete_data_disks_on_termination = true
lifecycle {
ignore_changes = [
"storage_os_disk",
"os_profile",
]
}
}
# Controller NICs with public and private IPv4
resource "azurerm_network_interface" "controllers" {
count = "${var.controller_count}"
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}-controller-${count.index}"
location = "${azurerm_resource_group.cluster.location}"
network_security_group_id = "${azurerm_network_security_group.controller.id}"
ip_configuration {
name = "ip0"
subnet_id = "${azurerm_subnet.controller.id}"
private_ip_address_allocation = "dynamic"
# public IPv4
public_ip_address_id = "${element(azurerm_public_ip.controllers.*.id, count.index)}"
}
}
# Add controller NICs to the controller backend address pool
resource "azurerm_network_interface_backend_address_pool_association" "controllers" {
network_interface_id = "${azurerm_network_interface.controllers.id}"
ip_configuration_name = "ip0"
backend_address_pool_id = "${azurerm_lb_backend_address_pool.controller.id}"
}
# Controller public IPv4 addresses
resource "azurerm_public_ip" "controllers" {
count = "${var.controller_count}"
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}-controller-${count.index}"
location = "${azurerm_resource_group.cluster.location}"
sku = "Standard"
public_ip_address_allocation = "static"
}
# Controller Ignition configs
data "ct_config" "controller-ignitions" {
count = "${var.controller_count}"
content = "${element(data.template_file.controller-configs.*.rendered, count.index)}"
pretty_print = false
snippets = ["${var.controller_clc_snippets}"]
}
# Controller Container Linux configs
data "template_file" "controller-configs" {
count = "${var.controller_count}"
template = "${file("${path.module}/cl/controller.yaml.tmpl")}"
vars = {
# Cannot use cyclic dependencies on controllers or their DNS records
etcd_name = "etcd${count.index}"
etcd_domain = "${var.cluster_name}-etcd${count.index}.${var.dns_zone}"
# etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,...
etcd_initial_cluster = "${join(",", data.template_file.etcds.*.rendered)}"
kubeconfig = "${indent(10, module.bootkube.kubeconfig-kubelet)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
cluster_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}
}
data "template_file" "etcds" {
count = "${var.controller_count}"
template = "etcd$${index}=https://$${cluster_name}-etcd$${index}.$${dns_zone}:2380"
vars {
index = "${count.index}"
cluster_name = "${var.cluster_name}"
dns_zone = "${var.dns_zone}"
}
}
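
For reference, joining the rendered templates above produces the comma-separated peer list that etcd expects. With illustrative values of controller_count = 2, cluster_name = "ram", and dns_zone = "azure.example.com", etcd_initial_cluster renders as:

etcd0=https://ram-etcd0.azure.example.com:2380,etcd1=https://ram-etcd1.azure.example.com:2380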

View File

@ -0,0 +1,144 @@
# DNS record for the apiserver load balancer
resource "azurerm_dns_a_record" "apiserver" {
resource_group_name = "${var.dns_zone_group}"
# DNS Zone name where record should be created
zone_name = "${var.dns_zone}"
# DNS record
name = "${var.cluster_name}"
ttl = 300
# IPv4 address of apiserver load balancer
records = ["${azurerm_public_ip.apiserver-ipv4.ip_address}"]
}
# Static IPv4 address for the apiserver frontend
resource "azurerm_public_ip" "apiserver-ipv4" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}-apiserver-ipv4"
location = "${var.region}"
sku = "Standard"
public_ip_address_allocation = "static"
}
# Static IPv4 address for the ingress frontend
resource "azurerm_public_ip" "ingress-ipv4" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}-ingress-ipv4"
location = "${var.region}"
sku = "Standard"
public_ip_address_allocation = "static"
}
# Network Load Balancer for apiservers and ingress
resource "azurerm_lb" "cluster" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}"
location = "${var.region}"
sku = "Standard"
frontend_ip_configuration {
name = "apiserver"
public_ip_address_id = "${azurerm_public_ip.apiserver-ipv4.id}"
}
frontend_ip_configuration {
name = "ingress"
public_ip_address_id = "${azurerm_public_ip.ingress-ipv4.id}"
}
}
resource "azurerm_lb_rule" "apiserver" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "apiserver"
loadbalancer_id = "${azurerm_lb.cluster.id}"
frontend_ip_configuration_name = "apiserver"
protocol = "Tcp"
frontend_port = 6443
backend_port = 6443
backend_address_pool_id = "${azurerm_lb_backend_address_pool.controller.id}"
probe_id = "${azurerm_lb_probe.apiserver.id}"
}
resource "azurerm_lb_rule" "ingress-http" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "ingress-http"
loadbalancer_id = "${azurerm_lb.cluster.id}"
frontend_ip_configuration_name = "ingress"
protocol = "Tcp"
frontend_port = 80
backend_port = 80
backend_address_pool_id = "${azurerm_lb_backend_address_pool.worker.id}"
probe_id = "${azurerm_lb_probe.ingress.id}"
}
resource "azurerm_lb_rule" "ingress-https" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "ingress-https"
loadbalancer_id = "${azurerm_lb.cluster.id}"
frontend_ip_configuration_name = "ingress"
protocol = "Tcp"
frontend_port = 443
backend_port = 443
backend_address_pool_id = "${azurerm_lb_backend_address_pool.worker.id}"
probe_id = "${azurerm_lb_probe.ingress.id}"
}
# Address pool of controllers
resource "azurerm_lb_backend_address_pool" "controller" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "controller"
loadbalancer_id = "${azurerm_lb.cluster.id}"
}
# Address pool of workers
resource "azurerm_lb_backend_address_pool" "worker" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "worker"
loadbalancer_id = "${azurerm_lb.cluster.id}"
}
# Health checks / probes
# TCP health check for apiserver
resource "azurerm_lb_probe" "apiserver" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "apiserver"
loadbalancer_id = "${azurerm_lb.cluster.id}"
protocol = "Tcp"
port = 6443
# unhealthy threshold
number_of_probes = 3
interval_in_seconds = 5
}
# HTTP health check for ingress
resource "azurerm_lb_probe" "ingress" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "ingress"
loadbalancer_id = "${azurerm_lb.cluster.id}"
protocol = "Http"
port = 10254
request_path = "/healthz"
# unhealthy threshold
number_of_probes = 3
interval_in_seconds = 5
}

View File

@ -0,0 +1,33 @@
# Organize cluster into a resource group
resource "azurerm_resource_group" "cluster" {
name = "${var.cluster_name}"
location = "${var.region}"
}
resource "azurerm_virtual_network" "network" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}"
location = "${azurerm_resource_group.cluster.location}"
address_space = ["${var.host_cidr}"]
}
# Subnets - separate subnets for controller and workers because Azure
# network security groups are based on IPv4 CIDR rather than instance
# tags like GCP or security group membership like AWS
resource "azurerm_subnet" "controller" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "controller"
virtual_network_name = "${azurerm_virtual_network.network.name}"
address_prefix = "${cidrsubnet(var.host_cidr, 1, 0)}"
}
resource "azurerm_subnet" "worker" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "worker"
virtual_network_name = "${azurerm_virtual_network.network.name}"
address_prefix = "${cidrsubnet(var.host_cidr, 1, 1)}"
}
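
The two cidrsubnet() calls split host_cidr into equal halves, one per subnet. As a worked example with the default host_cidr of 10.0.0.0/16 defined later in this changeset:

cidrsubnet("10.0.0.0/16", 1, 0)   # 10.0.0.0/17   - controller subnet
cidrsubnet("10.0.0.0/16", 1, 1)   # 10.0.128.0/17 - worker subnet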

View File

@ -0,0 +1,36 @@
output "kubeconfig-admin" {
value = "${module.bootkube.kubeconfig-admin}"
}
# Outputs for Kubernetes Ingress
output "ingress_static_ipv4" {
value = "${azurerm_public_ip.ingress-ipv4.ip_address}"
description = "IPv4 address of the load balancer for distributing traffic to Ingress controllers"
}
# Outputs for worker pools
output "region" {
value = "${azurerm_resource_group.cluster.location}"
}
output "resource_group_name" {
value = "${azurerm_resource_group.cluster.name}"
}
output "subnet_id" {
value = "${azurerm_subnet.worker.id}"
}
output "security_group_id" {
value = "${azurerm_network_security_group.worker.id}"
}
output "backend_address_pool_id" {
value = "${azurerm_lb_backend_address_pool.worker.id}"
}
output "kubeconfig" {
value = "${module.bootkube.kubeconfig-kubelet}"
}

View File

@ -0,0 +1,25 @@
# Terraform version and plugin versions
terraform {
required_version = ">= 0.11.0"
}
provider "azurerm" {
version = "~> 1.19"
}
provider "local" {
version = "~> 1.0"
}
provider "null" {
version = "~> 1.0"
}
provider "template" {
version = "~> 1.0"
}
provider "tls" {
version = "~> 1.0"
}

View File

@ -0,0 +1,287 @@
# Controller security group
resource "azurerm_network_security_group" "controller" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}-controller"
location = "${azurerm_resource_group.cluster.location}"
}
resource "azurerm_network_security_rule" "controller-ssh" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-ssh"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "2000"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "22"
source_address_prefix = "*"
destination_address_prefix = "${azurerm_subnet.controller.address_prefix}"
}
resource "azurerm_network_security_rule" "controller-etcd" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-etcd"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "2005"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "2379-2380"
source_address_prefix = "${azurerm_subnet.controller.address_prefix}"
destination_address_prefix = "${azurerm_subnet.controller.address_prefix}"
}
# Allow Prometheus to scrape etcd metrics
resource "azurerm_network_security_rule" "controller-etcd-metrics" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-etcd-metrics"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "2010"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "2381"
source_address_prefix = "${azurerm_subnet.worker.address_prefix}"
destination_address_prefix = "${azurerm_subnet.controller.address_prefix}"
}
resource "azurerm_network_security_rule" "controller-apiserver" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-apiserver"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "2015"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "6443"
source_address_prefix = "*"
destination_address_prefix = "${azurerm_subnet.controller.address_prefix}"
}
resource "azurerm_network_security_rule" "controller-flannel" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-flannel"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "2020"
access = "Allow"
direction = "Inbound"
protocol = "Udp"
source_port_range = "*"
destination_port_range = "8472"
source_address_prefixes = ["${azurerm_subnet.controller.address_prefix}", "${azurerm_subnet.worker.address_prefix}"]
destination_address_prefix = "${azurerm_subnet.controller.address_prefix}"
}
# Allow Prometheus to scrape node-exporter daemonset
resource "azurerm_network_security_rule" "controller-node-exporter" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-node-exporter"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "2025"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "9100"
source_address_prefix = "${azurerm_subnet.worker.address_prefix}"
destination_address_prefix = "${azurerm_subnet.controller.address_prefix}"
}
# Allow apiserver to access kubelets for exec, log, port-forward
resource "azurerm_network_security_rule" "controller-kubelet" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-kubelet"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "2030"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10250"
# allow Prometheus to scrape kubelet metrics too
source_address_prefixes = ["${azurerm_subnet.controller.address_prefix}", "${azurerm_subnet.worker.address_prefix}"]
destination_address_prefix = "${azurerm_subnet.controller.address_prefix}"
}
# Override Azure AllowVNetInBound and AllowAzureLoadBalancerInBound
# https://docs.microsoft.com/en-us/azure/virtual-network/security-overview#default-security-rules
resource "azurerm_network_security_rule" "controller-allow-loadblancer" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-loadbalancer"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "3000"
access = "Allow"
direction = "Inbound"
protocol = "*"
source_port_range = "*"
destination_port_range = "*"
source_address_prefix = "AzureLoadBalancer"
destination_address_prefix = "*"
}
resource "azurerm_network_security_rule" "controller-deny-all" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "deny-all"
network_security_group_name = "${azurerm_network_security_group.controller.name}"
priority = "3005"
access = "Deny"
direction = "Inbound"
protocol = "*"
source_port_range = "*"
destination_port_range = "*"
source_address_prefix = "*"
destination_address_prefix = "*"
}
# Worker security group
resource "azurerm_network_security_group" "worker" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "${var.cluster_name}-worker"
location = "${azurerm_resource_group.cluster.location}"
}
resource "azurerm_network_security_rule" "worker-ssh" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-ssh"
network_security_group_name = "${azurerm_network_security_group.worker.name}"
priority = "2000"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "22"
source_address_prefix = "${azurerm_subnet.controller.address_prefix}"
destination_address_prefix = "${azurerm_subnet.worker.address_prefix}"
}
resource "azurerm_network_security_rule" "worker-http" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-http"
network_security_group_name = "${azurerm_network_security_group.worker.name}"
priority = "2005"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "80"
source_address_prefix = "*"
destination_address_prefix = "${azurerm_subnet.worker.address_prefix}"
}
resource "azurerm_network_security_rule" "worker-https" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-https"
network_security_group_name = "${azurerm_network_security_group.worker.name}"
priority = "2010"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "443"
source_address_prefix = "*"
destination_address_prefix = "${azurerm_subnet.worker.address_prefix}"
}
resource "azurerm_network_security_rule" "worker-flannel" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-flannel"
network_security_group_name = "${azurerm_network_security_group.worker.name}"
priority = "2015"
access = "Allow"
direction = "Inbound"
protocol = "Udp"
source_port_range = "*"
destination_port_range = "8472"
source_address_prefixes = ["${azurerm_subnet.controller.address_prefix}", "${azurerm_subnet.worker.address_prefix}"]
destination_address_prefix = "${azurerm_subnet.worker.address_prefix}"
}
# Allow Prometheus to scrape node-exporter daemonset
resource "azurerm_network_security_rule" "worker-node-exporter" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-node-exporter"
network_security_group_name = "${azurerm_network_security_group.worker.name}"
priority = "2020"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "9100"
source_address_prefix = "${azurerm_subnet.worker.address_prefix}"
destination_address_prefix = "${azurerm_subnet.worker.address_prefix}"
}
# Allow apiserver to access kubelets for exec, log, port-forward
resource "azurerm_network_security_rule" "worker-kubelet" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-kubelet"
network_security_group_name = "${azurerm_network_security_group.worker.name}"
priority = "2025"
access = "Allow"
direction = "Inbound"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "10250"
# allow Prometheus to scrape kubelet metrics too
source_address_prefixes = ["${azurerm_subnet.controller.address_prefix}", "${azurerm_subnet.worker.address_prefix}"]
destination_address_prefix = "${azurerm_subnet.worker.address_prefix}"
}
# Override Azure AllowVNetInBound and AllowAzureLoadBalancerInBound
# https://docs.microsoft.com/en-us/azure/virtual-network/security-overview#default-security-rules
resource "azurerm_network_security_rule" "worker-allow-loadblancer" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "allow-loadbalancer"
network_security_group_name = "${azurerm_network_security_group.worker.name}"
priority = "3000"
access = "Allow"
direction = "Inbound"
protocol = "*"
source_port_range = "*"
destination_port_range = "*"
source_address_prefix = "AzureLoadBalancer"
destination_address_prefix = "*"
}
resource "azurerm_network_security_rule" "worker-deny-all" {
resource_group_name = "${azurerm_resource_group.cluster.name}"
name = "deny-all"
network_security_group_name = "${azurerm_network_security_group.worker.name}"
priority = "3005"
access = "Deny"
direction = "Inbound"
protocol = "*"
source_port_range = "*"
destination_port_range = "*"
source_address_prefix = "*"
destination_address_prefix = "*"
}

View File

@ -0,0 +1,95 @@
# Secure copy etcd TLS assets to controllers.
resource "null_resource" "copy-controller-secrets" {
count = "${var.controller_count}"
depends_on = [
"azurerm_virtual_machine.controllers",
]
connection {
type = "ssh"
host = "${element(azurerm_public_ip.controllers.*.ip_address, count.index)}"
user = "core"
timeout = "15m"
}
provisioner "file" {
content = "${module.bootkube.etcd_ca_cert}"
destination = "$HOME/etcd-client-ca.crt"
}
provisioner "file" {
content = "${module.bootkube.etcd_client_cert}"
destination = "$HOME/etcd-client.crt"
}
provisioner "file" {
content = "${module.bootkube.etcd_client_key}"
destination = "$HOME/etcd-client.key"
}
provisioner "file" {
content = "${module.bootkube.etcd_server_cert}"
destination = "$HOME/etcd-server.crt"
}
provisioner "file" {
content = "${module.bootkube.etcd_server_key}"
destination = "$HOME/etcd-server.key"
}
provisioner "file" {
content = "${module.bootkube.etcd_peer_cert}"
destination = "$HOME/etcd-peer.crt"
}
provisioner "file" {
content = "${module.bootkube.etcd_peer_key}"
destination = "$HOME/etcd-peer.key"
}
provisioner "remote-exec" {
inline = [
"sudo mkdir -p /etc/ssl/etcd/etcd",
"sudo mv etcd-client* /etc/ssl/etcd/",
"sudo cp /etc/ssl/etcd/etcd-client-ca.crt /etc/ssl/etcd/etcd/server-ca.crt",
"sudo mv etcd-server.crt /etc/ssl/etcd/etcd/server.crt",
"sudo mv etcd-server.key /etc/ssl/etcd/etcd/server.key",
"sudo cp /etc/ssl/etcd/etcd-client-ca.crt /etc/ssl/etcd/etcd/peer-ca.crt",
"sudo mv etcd-peer.crt /etc/ssl/etcd/etcd/peer.crt",
"sudo mv etcd-peer.key /etc/ssl/etcd/etcd/peer.key",
"sudo chown -R etcd:etcd /etc/ssl/etcd",
"sudo chmod -R 500 /etc/ssl/etcd",
]
}
}
# Secure copy bootkube assets to ONE controller and start bootkube to perform
# one-time self-hosted cluster bootstrapping.
resource "null_resource" "bootkube-start" {
depends_on = [
"module.bootkube",
"module.workers",
"azurerm_dns_a_record.apiserver",
"null_resource.copy-controller-secrets",
]
connection {
type = "ssh"
host = "${element(azurerm_public_ip.controllers.*.ip_address, 0)}"
user = "core"
timeout = "15m"
}
provisioner "file" {
source = "${var.asset_dir}"
destination = "$HOME/assets"
}
provisioner "remote-exec" {
inline = [
"sudo mv $HOME/assets /opt/bootkube",
"sudo systemctl start bootkube",
]
}
}

View File

@ -0,0 +1,123 @@
variable "cluster_name" {
type = "string"
description = "Unique cluster name (prepended to dns_zone)"
}
# Azure
variable "region" {
type = "string"
description = "Azure Region (e.g. centralus , see `az account list-locations --output table`)"
}
variable "dns_zone" {
type = "string"
description = "Azure DNS Zone (e.g. azure.example.com)"
}
variable "dns_zone_group" {
type = "string"
description = "Resource group where the Azure DNS Zone resides (e.g. global)"
}
# instances
variable "controller_count" {
type = "string"
default = "1"
description = "Number of controllers (i.e. masters)"
}
variable "worker_count" {
type = "string"
default = "1"
description = "Number of workers"
}
variable "controller_type" {
type = "string"
default = "Standard_DS1_v2"
description = "Machine type for controllers (see `az vm list-skus --location centralus`)"
}
variable "worker_type" {
type = "string"
default = "Standard_F1"
description = "Machine type for workers (see `az vm list-skus --location centralus`)"
}
variable "os_image" {
type = "string"
default = "coreos-stable"
description = "Channel for a Container Linux derivative (coreos-stable, coreos-beta, coreos-alpha)"
}
variable "disk_size" {
type = "string"
default = "40"
description = "Size of the disk in GB"
}
variable "worker_priority" {
type = "string"
default = "Regular"
description = "Set worker priority to Low to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time."
}
variable "controller_clc_snippets" {
type = "list"
description = "Controller Container Linux Config snippets"
default = []
}
variable "worker_clc_snippets" {
type = "list"
description = "Worker Container Linux Config snippets"
default = []
}
# configuration
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for user 'core'"
}
variable "asset_dir" {
description = "Path to a directory where generated assets should be placed (contains secrets)"
type = "string"
}
variable "host_cidr" {
description = "CIDR IPv4 range to assign to instances"
type = "string"
default = "10.0.0.0/16"
}
variable "pod_cidr" {
description = "CIDR IPv4 range to assign Kubernetes pods"
type = "string"
default = "10.2.0.0/16"
}
variable "service_cidr" {
description = <<EOD
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD
type = "string"
default = "10.3.0.0/16"
}
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = "string"
default = "cluster.local"
}
variable "enable_reporting" {
type = "string"
description = "Enable usage or analytics reporting to upstreams (Calico)"
default = "false"
}

View File

@ -0,0 +1,23 @@
module "workers" {
source = "workers"
name = "${var.cluster_name}"
# Azure
resource_group_name = "${azurerm_resource_group.cluster.name}"
region = "${azurerm_resource_group.cluster.location}"
subnet_id = "${azurerm_subnet.worker.id}"
security_group_id = "${azurerm_network_security_group.worker.id}"
backend_address_pool_id = "${azurerm_lb_backend_address_pool.worker.id}"
count = "${var.worker_count}"
vm_type = "${var.worker_type}"
os_image = "${var.os_image}"
priority = "${var.worker_priority}"
# configuration
kubeconfig = "${module.bootkube.kubeconfig-kubelet}"
ssh_authorized_key = "${var.ssh_authorized_key}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
clc_snippets = "${var.worker_clc_snippets}"
}

View File

@ -0,0 +1,122 @@
---
systemd:
units:
- name: docker.service
enable: true
- name: locksmithd.service
mask: true
- name: wait-for-dns.service
enable: true
contents: |
[Unit]
Description=Wait for DNS entries
Wants=systemd-resolved.service
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
- name: kubelet.service
enable: true
contents: |
[Unit]
Description=Kubelet via Hyperkube
Wants=rpc-statd.service
[Service]
EnvironmentFile=/etc/kubernetes/kubelet.env
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/cache/kubelet-pod.uuid \
--volume=resolv,kind=host,source=/etc/resolv.conf \
--mount volume=resolv,target=/etc/resolv.conf \
--volume var-lib-cni,kind=host,source=/var/lib/cni \
--mount volume=var-lib-cni,target=/var/lib/cni \
--volume var-lib-calico,kind=host,source=/var/lib/calico \
--mount volume=var-lib-calico,target=/var/lib/calico \
--volume opt-cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=opt-cni-bin,target=/opt/cni/bin \
--volume var-log,kind=host,source=/var/log \
--mount volume=var-log,target=/var/log \
--insecure-options=image"
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /var/lib/calico
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/node \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
- name: delete-node.service
enable: true
contents: |
[Unit]
Description=Waiting to delete Kubernetes node on shutdown
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/etc/kubernetes/delete-node
[Install]
WantedBy=multi-user.target
storage:
files:
- path: /etc/kubernetes/kubeconfig
filesystem: root
mode: 0644
contents:
inline: |
${kubeconfig}
- path: /etc/kubernetes/kubelet.env
filesystem: root
mode: 0644
contents:
inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.13.2
- path: /etc/sysctl.d/max-user-watches.conf
filesystem: root
contents:
inline: |
fs.inotify.max_user_watches=16184
- path: /etc/kubernetes/delete-node
filesystem: root
mode: 0744
contents:
inline: |
#!/bin/bash
set -e
exec /usr/bin/rkt run \
--trust-keys-from-https \
--volume config,kind=host,source=/etc/kubernetes \
--mount volume=config,target=/etc/kubernetes \
--insecure-options=image \
docker://k8s.gcr.io/hyperkube:v1.13.2 \
--net=host \
--dns=host \
--exec=/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname | tr '[:upper:]' '[:lower:]')
passwd:
users:
- name: core
ssh_authorized_keys:
- "${ssh_authorized_key}"

View File

@ -0,0 +1,91 @@
variable "name" {
type = "string"
description = "Unique name for the worker pool"
}
# Azure
variable "region" {
type = "string"
description = "Must be set to the Azure Region of cluster"
}
variable "resource_group_name" {
type = "string"
description = "Must be set to the resource group name of cluster"
}
variable "subnet_id" {
type = "string"
description = "Must be set to the `worker_subnet_id` output by cluster"
}
variable "security_group_id" {
type = "string"
description = "Must be set to the `worker_security_group_id` output by cluster"
}
variable "backend_address_pool_id" {
type = "string"
description = "Must be set to the `worker_backend_address_pool_id` output by cluster"
}
# instances
variable "count" {
type = "string"
default = "1"
description = "Number of instances"
}
variable "vm_type" {
type = "string"
default = "Standard_F1"
description = "Machine type for instances (see `az vm list-skus --location centralus`)"
}
variable "os_image" {
type = "string"
default = "coreos-stable"
description = "Channel for a Container Linux derivative (coreos-stable, coreos-beta, coreos-alpha)"
}
variable "priority" {
type = "string"
default = "Regular"
description = "Set priority to Low to use reduced cost surplus capacity, with the tradeoff that instances can be evicted at any time."
}
variable "clc_snippets" {
type = "list"
description = "Container Linux Config snippets"
default = []
}
# configuration
variable "kubeconfig" {
type = "string"
description = "Must be set to `kubeconfig` output by cluster"
}
variable "ssh_authorized_key" {
type = "string"
description = "SSH public key for user 'core'"
}
variable "service_cidr" {
description = <<EOD
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD
type = "string"
default = "10.3.0.0/16"
}
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = "string"
default = "cluster.local"
}
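
These inputs correspond to the cluster outputs added in outputs.tf above (region, resource_group_name, subnet_id, security_group_id, backend_address_pool_id, kubeconfig), which suggests the workers module can also be instantiated on its own as an extra worker pool. A minimal sketch under that assumption; the module path, pool name, and sizing values are illustrative and not taken from this diff:

module "ramius-worker-pool" {
  source = "path/to/azure/container-linux/kubernetes/workers"   # hypothetical module path

  # Azure, wired to outputs of a cluster module instance named "ramius" (hypothetical)
  region                  = "${module.ramius.region}"
  resource_group_name     = "${module.ramius.resource_group_name}"
  subnet_id               = "${module.ramius.subnet_id}"
  security_group_id       = "${module.ramius.security_group_id}"
  backend_address_pool_id = "${module.ramius.backend_address_pool_id}"

  # instances
  name     = "ramius-low"
  count    = "2"
  vm_type  = "Standard_F4"
  priority = "Low"

  # configuration
  kubeconfig         = "${module.ramius.kubeconfig}"
  ssh_authorized_key = "${var.ssh_authorized_key}"
}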

View File

@ -0,0 +1,114 @@
locals {
# Channel for a Container Linux derivative
# coreos-stable -> Container Linux Stable
channel = "${element(split("-", var.os_image), 1)}"
}
# Workers scale set
resource "azurerm_virtual_machine_scale_set" "workers" {
resource_group_name = "${var.resource_group_name}"
name = "${var.name}-workers"
location = "${var.region}"
single_placement_group = false
sku {
name = "${var.vm_type}"
tier = "standard"
capacity = "${var.count}"
}
# boot
storage_profile_image_reference {
publisher = "CoreOS"
offer = "CoreOS"
sku = "${local.channel}"
version = "latest"
}
# storage
storage_profile_os_disk {
create_option = "FromImage"
caching = "ReadWrite"
os_type = "linux"
managed_disk_type = "Standard_LRS"
}
os_profile {
computer_name_prefix = "${var.name}-worker-"
admin_username = "core"
custom_data = "${data.ct_config.worker-ignition.rendered}"
}
# Azure mandates setting an ssh_key, even though Ignition custom_data handles it too
os_profile_linux_config {
disable_password_authentication = true
ssh_keys {
path = "/home/core/.ssh/authorized_keys"
key_data = "${var.ssh_authorized_key}"
}
}
# network
network_profile {
name = "nic0"
primary = true
network_security_group_id = "${var.security_group_id}"
ip_configuration {
name = "ip0"
primary = true
subnet_id = "${var.subnet_id}"
# backend address pool to which the NIC should be added
load_balancer_backend_address_pool_ids = ["${var.backend_address_pool_id}"]
}
}
# lifecycle
upgrade_policy_mode = "Manual"
priority = "${var.priority}"
eviction_policy = "Delete"
}
# Scale up or down to maintain desired number, tolerating deallocations.
resource "azurerm_autoscale_setting" "workers" {
resource_group_name = "${var.resource_group_name}"
name = "${var.name}-maintain-desired"
location = "${var.region}"
# autoscale
enabled = true
target_resource_id = "${azurerm_virtual_machine_scale_set.workers.id}"
profile {
name = "default"
capacity {
minimum = "${var.count}"
default = "${var.count}"
maximum = "${var.count}"
}
}
}
# Worker Ignition configs
data "ct_config" "worker-ignition" {
content = "${data.template_file.worker-config.rendered}"
pretty_print = false
snippets = ["${var.clc_snippets}"]
}
# Worker Container Linux configs
data "template_file" "worker-config" {
template = "${file("${path.module}/cl/worker.yaml.tmpl")}"
vars = {
kubeconfig = "${indent(10, var.kubeconfig)}"
ssh_authorized_key = "${var.ssh_authorized_key}"
cluster_dns_service_ip = "${cidrhost(var.service_cidr, 10)}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
}
}

azure/ignore/.gitkeep Normal file
View File

View File

@ -11,9 +11,10 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.10.3 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* Kubernetes v1.13.2 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)
## Docs

View File

@ -1,6 +1,6 @@
# Self-hosted Kubernetes assets (kubeconfig, manifests)
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=3fa3c2d73b57b2372c7c68e7db1cf82932ea1380"
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=bcbdddd8d07c99ab88b2e9ebfb662de4c104de0a"
cluster_name = "${var.cluster_name}"
api_servers = ["${var.k8s_domain_name}"]
@ -12,4 +12,5 @@ module "bootkube" {
pod_cidr = "${var.pod_cidr}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
enable_reporting = "${var.enable_reporting}"
}

View File

@ -7,7 +7,7 @@ systemd:
- name: 40-etcd-cluster.conf
contents: |
[Service]
Environment="ETCD_IMAGE_TAG=v3.3.5"
Environment="ETCD_IMAGE_TAG=v3.3.10"
Environment="ETCD_NAME=${etcd_name}"
Environment="ETCD_ADVERTISE_CLIENT_URLS=https://${domain_name}:2379"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${domain_name}:2380"
@ -70,6 +70,10 @@ systemd:
--mount volume=opt-cni-bin,target=/opt/cni/bin \
--volume var-log,kind=host,source=/var/log \
--mount volume=var-log,target=/var/log \
--volume iscsiconf,kind=host,source=/etc/iscsi/ \
--mount volume=iscsiconf,target=/etc/iscsi/ \
--volume iscsiadm,kind=host,source=/usr/sbin/iscsiadm \
--mount volume=iscsiadm,target=/sbin/iscsiadm \
--insecure-options=image"
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
@ -82,12 +86,11 @@ systemd:
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
--allow-privileged \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${k8s_dns_service_ip} \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
@ -98,6 +101,7 @@ systemd:
--node-labels=node-role.kubernetes.io/master \
--node-labels=node-role.kubernetes.io/controller="true" \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
@ -124,7 +128,7 @@ storage:
contents:
inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.3
KUBELET_IMAGE_TAG=v1.13.2
- path: /etc/hostname
filesystem: root
mode: 0644
@ -150,22 +154,17 @@ storage:
set -e
# Move experimental manifests
[ -n "$(ls /opt/bootkube/assets/manifests-*/* 2>/dev/null)" ] && mv /opt/bootkube/assets/manifests-*/* /opt/bootkube/assets/manifests && rm -rf /opt/bootkube/assets/manifests-*
BOOTKUBE_ACI="$${BOOTKUBE_ACI:-quay.io/coreos/bootkube}"
BOOTKUBE_VERSION="$${BOOTKUBE_VERSION:-v0.12.0}"
BOOTKUBE_ASSETS="$${BOOTKUBE_ASSETS:-/opt/bootkube/assets}"
exec /usr/bin/rkt run \
--trust-keys-from-https \
--volume assets,kind=host,source=$BOOTKUBE_ASSETS \
--volume assets,kind=host,source=/opt/bootkube/assets \
--mount volume=assets,target=/assets \
--volume bootstrap,kind=host,source=/etc/kubernetes \
--mount volume=bootstrap,target=/etc/kubernetes \
$$RKT_OPTS \
$${BOOTKUBE_ACI}:$${BOOTKUBE_VERSION} \
quay.io/coreos/bootkube:v0.14.0 \
--net=host \
--dns=host \
--exec=/bootkube -- start --asset-dir=/assets "$@"
networkd:
${networkd_content}
passwd:
users:
- name: core

View File

@ -45,6 +45,10 @@ systemd:
--mount volume=opt-cni-bin,target=/opt/cni/bin \
--volume var-log,kind=host,source=/var/log \
--mount volume=var-log,target=/var/log \
--volume iscsiconf,kind=host,source=/etc/iscsi/ \
--mount volume=iscsiconf,target=/etc/iscsi/ \
--volume iscsiadm,kind=host,source=/usr/sbin/iscsiadm \
--mount volume=iscsiadm,target=/sbin/iscsiadm \
--insecure-options=image"
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
@ -55,12 +59,11 @@ systemd:
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
--allow-privileged \
--anonymous-auth=false \
--authentication-token-webhook \
--authorization-mode=Webhook \
--client-ca-file=/etc/kubernetes/ca.crt \
--cluster_dns=${k8s_dns_service_ip} \
--cluster_dns=${cluster_dns_service_ip} \
--cluster_domain=${cluster_domain_suffix} \
--cni-conf-dir=/etc/kubernetes/cni/net.d \
--exit-on-lock-contention \
@ -70,6 +73,7 @@ systemd:
--network-plugin=cni \
--node-labels=node-role.kubernetes.io/node \
--pod-manifest-path=/etc/kubernetes/manifests \
--read-only-port=0 \
--volume-plugin-dir=/var/lib/kubelet/volumeplugins
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid
Restart=always
@@ -85,7 +89,7 @@ storage:
contents:
inline: |
KUBELET_IMAGE_URL=docker://k8s.gcr.io/hyperkube
KUBELET_IMAGE_TAG=v1.10.3
KUBELET_IMAGE_TAG=v1.13.2
- path: /etc/hostname
filesystem: root
mode: 0644
@@ -97,8 +101,6 @@ storage:
contents:
inline: |
fs.inotify.max_user_watches=16184
networkd:
${networkd_content}
passwd:
users:
- name: core


@@ -1,9 +1,9 @@
resource "matchbox_group" "install" {
count = "${length(var.controller_names) + length(var.worker_names)}"
name = "${format("install-%s", element(concat(var.controller_names, var.worker_names), count.index))}"
name = "${format("install-%s", element(concat(var.controller_names, var.worker_names), count.index))}"
profile = "${local.flavor == "flatcar" ? element(matchbox_profile.flatcar-install.*.name, count.index) : var.cached_install == "true" ? element(matchbox_profile.cached-container-linux-install.*.name, count.index) : element(matchbox_profile.container-linux-install.*.name, count.index)}"
profile = "${local.flavor == "flatcar" ? var.cached_install == "true" ? element(matchbox_profile.cached-flatcar-linux-install.*.name, count.index) : element(matchbox_profile.flatcar-install.*.name, count.index) : var.cached_install == "true" ? element(matchbox_profile.cached-container-linux-install.*.name, count.index) : element(matchbox_profile.container-linux-install.*.name, count.index)}"
selector {
mac = "${element(concat(var.controller_macs, var.worker_macs), count.index)}"


@@ -1,3 +1,3 @@
output "kubeconfig" {
value = "${module.bootkube.kubeconfig}"
output "kubeconfig-admin" {
value = "${module.bootkube.kubeconfig-admin}"
}
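As a hedged usage sketch (the module name "mercury" and the output path are hypothetical, not part of this diff), the renamed kubeconfig-admin output can be consumed from a cluster definition and written to disk with the local_file provider:
# Hypothetical consumer of the renamed output (Terraform 0.11 syntax).
resource "local_file" "kubeconfig-mercury" {
  content  = "${module.mercury.kubeconfig-admin}"
  filename = "/home/user/.secrets/clusters/mercury/auth/kubeconfig"
}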


@@ -1,8 +1,9 @@
locals {
# coreos-stable -> coreos flavor, stable channel
# flatcar-stable -> flatcar flavor, stable channel
flavor = "${element(split("-", var.os_channel), 0)}"
channel = "${element(split("-", var.os_channel), 1)}"
flavor = "${element(split("-", var.os_channel), 0)}"
channel = "${element(split("-", var.os_channel), 1)}"
}
// Container Linux Install profile (from release.core-os.net)
@@ -48,7 +49,7 @@ data "template_file" "container-linux-install-configs" {
}
// Container Linux Install profile (from matchbox /assets cache)
// Note: Admin must have downloaded os_version into matchbox assets.
// Note: Admin must have downloaded os_version into matchbox assets/coreos.
resource "matchbox_profile" "cached-container-linux-install" {
count = "${length(var.controller_names) + length(var.worker_names)}"
name = "${format("%s-cached-container-linux-install-%s", var.cluster_name, element(concat(var.controller_names, var.worker_names), count.index))}"
@@ -86,7 +87,7 @@ data "template_file" "cached-container-linux-install-configs" {
ssh_authorized_key = "${var.ssh_authorized_key}"
# profile uses -b baseurl to install from matchbox cache
baseurl_flag = "-b ${var.matchbox_http_endpoint}/assets/coreos"
baseurl_flag = "-b ${var.matchbox_http_endpoint}/assets/${local.flavor}"
}
}
@@ -113,11 +114,44 @@ resource "matchbox_profile" "flatcar-install" {
container_linux_config = "${element(data.template_file.container-linux-install-configs.*.rendered, count.index)}"
}
// Flatcar Linux Install profile (from matchbox /assets cache)
// Note: Admin must have downloaded os_version into matchbox assets/flatcar.
resource "matchbox_profile" "cached-flatcar-linux-install" {
count = "${length(var.controller_names) + length(var.worker_names)}"
name = "${format("%s-cached-flatcar-linux-install-%s", var.cluster_name, element(concat(var.controller_names, var.worker_names), count.index))}"
kernel = "/assets/flatcar/${var.os_version}/flatcar_production_pxe.vmlinuz"
initrd = [
"/assets/flatcar/${var.os_version}/flatcar_production_pxe_image.cpio.gz",
]
args = [
"initrd=flatcar_production_pxe_image.cpio.gz",
"flatcar.config.url=${var.matchbox_http_endpoint}/ignition?uuid=$${uuid}&mac=$${mac:hexhyp}",
"flatcar.first_boot=yes",
"console=tty0",
"console=ttyS0",
"${var.kernel_args}",
]
container_linux_config = "${element(data.template_file.cached-container-linux-install-configs.*.rendered, count.index)}"
}
// Kubernetes Controller profiles
resource "matchbox_profile" "controllers" {
count = "${length(var.controller_names)}"
name = "${format("%s-controller-%s", var.cluster_name, element(var.controller_names, count.index))}"
container_linux_config = "${element(data.template_file.controller-configs.*.rendered, count.index)}"
count = "${length(var.controller_names)}"
name = "${format("%s-controller-%s", var.cluster_name, element(var.controller_names, count.index))}"
raw_ignition = "${element(data.ct_config.controller-ignitions.*.rendered, count.index)}"
}
data "ct_config" "controller-ignitions" {
count = "${length(var.controller_names)}"
content = "${element(data.template_file.controller-configs.*.rendered, count.index)}"
pretty_print = false
# Must use direct lookup. Cannot use lookup(map, key) since it only works for flat maps
snippets = ["${local.clc_map[element(var.controller_names, count.index)]}"]
}
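The comment above is worth a concrete, hypothetical illustration (Terraform 0.11 / HIL), since it is a common stumbling block:
# lookup() only supports flat maps whose values are strings, so this would fail
# because clc_map values are lists of Container Linux Config snippets:
#   snippets = ["${lookup(local.clc_map, element(var.controller_names, count.index))}"]
# Direct indexing, as used above, returns the full list value for the key:
#   snippets = ["${local.clc_map[element(var.controller_names, count.index)]}"]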
data "template_file" "controller-configs" {
@@ -126,23 +160,29 @@ data "template_file" "controller-configs" {
template = "${file("${path.module}/cl/controller.yaml.tmpl")}"
vars {
domain_name = "${element(var.controller_domains, count.index)}"
etcd_name = "${element(var.controller_names, count.index)}"
etcd_initial_cluster = "${join(",", formatlist("%s=https://%s:2380", var.controller_names, var.controller_domains))}"
k8s_dns_service_ip = "${module.bootkube.kube_dns_service_ip}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
ssh_authorized_key = "${var.ssh_authorized_key}"
# Terraform evaluates both sides regardless and element cannot be used on 0 length lists
networkd_content = "${length(var.controller_networkds) == 0 ? "" : element(concat(var.controller_networkds, list("")), count.index)}"
domain_name = "${element(var.controller_domains, count.index)}"
etcd_name = "${element(var.controller_names, count.index)}"
etcd_initial_cluster = "${join(",", formatlist("%s=https://%s:2380", var.controller_names, var.controller_domains))}"
cluster_dns_service_ip = "${module.bootkube.cluster_dns_service_ip}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
ssh_authorized_key = "${var.ssh_authorized_key}"
}
}
// Kubernetes Worker profiles
resource "matchbox_profile" "workers" {
count = "${length(var.worker_names)}"
name = "${format("%s-worker-%s", var.cluster_name, element(var.worker_names, count.index))}"
container_linux_config = "${element(data.template_file.worker-configs.*.rendered, count.index)}"
count = "${length(var.worker_names)}"
name = "${format("%s-worker-%s", var.cluster_name, element(var.worker_names, count.index))}"
raw_ignition = "${element(data.ct_config.worker-ignitions.*.rendered, count.index)}"
}
data "ct_config" "worker-ignitions" {
count = "${length(var.worker_names)}"
content = "${element(data.template_file.worker-configs.*.rendered, count.index)}"
pretty_print = false
# Must use direct lookup. Cannot use lookup(map, key) since it only works for flat maps
snippets = ["${local.clc_map[element(var.worker_names, count.index)]}"]
}
data "template_file" "worker-configs" {
@@ -151,12 +191,25 @@ data "template_file" "worker-configs" {
template = "${file("${path.module}/cl/worker.yaml.tmpl")}"
vars {
domain_name = "${element(var.worker_domains, count.index)}"
k8s_dns_service_ip = "${module.bootkube.kube_dns_service_ip}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
ssh_authorized_key = "${var.ssh_authorized_key}"
# Terraform evaluates both sides regardless and element cannot be used on 0 length lists
networkd_content = "${length(var.worker_networkds) == 0 ? "" : element(concat(var.worker_networkds, list("")), count.index)}"
domain_name = "${element(var.worker_domains, count.index)}"
cluster_dns_service_ip = "${module.bootkube.cluster_dns_service_ip}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
ssh_authorized_key = "${var.ssh_authorized_key}"
}
}
locals {
# Hack to workaround https://github.com/hashicorp/terraform/issues/17251
# Default Container Linux config snippets map every node name to list("\n") so
# all lookups succeed
clc_defaults = "${zipmap(concat(var.controller_names, var.worker_names), chunklist(data.template_file.clc-default-snippets.*.rendered, 1))}"
# Union of the default and user specific snippets, later overrides prior.
clc_map = "${merge(local.clc_defaults, var.clc_snippets)}"
}
// Horrible hack to generate a Terraform list of node count length
data "template_file" "clc-default-snippets" {
count = "${length(var.controller_names) + length(var.worker_names)}"
template = "\n"
}
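To make the workaround above easier to follow, here is a hypothetical trace of the values involved (node names and the user snippet are invented for illustration):
# Assuming controller_names = ["node1"], worker_names = ["node2"], and one
# user-provided snippet for node2:
#
#   local.clc_defaults = {
#     node1 = ["\n"]
#     node2 = ["\n"]
#   }
#   var.clc_snippets = {
#     node2 = ["<contents of a Container Linux Config snippet>"]
#   }
#   # merge() lets the user entry override the default for node2:
#   local.clc_map = {
#     node1 = ["\n"]
#     node2 = ["<contents of a Container Linux Config snippet>"]
#   }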


@@ -2,6 +2,14 @@
resource "null_resource" "copy-controller-secrets" {
count = "${length(var.controller_names)}"
# Without depends_on, remote-exec could start and wait for machines before
# matchbox groups are written, causing a deadlock.
depends_on = [
"matchbox_group.install",
"matchbox_group.controller",
"matchbox_group.worker",
]
connection {
type = "ssh"
host = "${element(var.controller_domains, count.index)}"
@@ -10,7 +18,7 @@ resource "null_resource" "copy-controller-secrets" {
}
provisioner "file" {
content = "${module.bootkube.kubeconfig}"
content = "${module.bootkube.kubeconfig-kubelet}"
destination = "$HOME/kubeconfig"
}
@@ -70,6 +78,14 @@ resource "null_resource" "copy-controller-secrets" {
resource "null_resource" "copy-worker-secrets" {
count = "${length(var.worker_names)}"
# Without depends_on, remote-exec could start and wait for machines before
# matchbox groups are written, causing a deadlock.
depends_on = [
"matchbox_group.install",
"matchbox_group.controller",
"matchbox_group.worker",
]
connection {
type = "ssh"
host = "${element(var.worker_domains, count.index)}"
@@ -78,7 +94,7 @@ resource "null_resource" "copy-worker-secrets" {
}
provisioner "file" {
content = "${module.bootkube.kubeconfig}"
content = "${module.bootkube.kubeconfig-kubelet}"
destination = "$HOME/kubeconfig"
}


@@ -24,27 +24,39 @@ variable "os_version" {
# Terraform's crude "type system" does not properly support lists of maps so we do this.
variable "controller_names" {
type = "list"
type = "list"
description = "Ordered list of controller names (e.g. [node1])"
}
variable "controller_macs" {
type = "list"
type = "list"
description = "Ordered list of controller identifying MAC addresses (e.g. [52:54:00:a1:9c:ae])"
}
variable "controller_domains" {
type = "list"
type = "list"
description = "Ordered list of controller FQDNs (e.g. [node1.example.com])"
}
variable "worker_names" {
type = "list"
type = "list"
description = "Ordered list of worker names (e.g. [node2, node3])"
}
variable "worker_macs" {
type = "list"
type = "list"
description = "Ordered list of worker identifying MAC addresses (e.g. [52:54:00:b2:2f:86, 52:54:00:c3:61:77])"
}
variable "worker_domains" {
type = "list"
type = "list"
description = "Ordered list of worker FQDNs (e.g. [node2.example.com, node3.example.com])"
}
variable "clc_snippets" {
type = "map"
description = "Map from machine names to lists of Container Linux Config snippets"
default = {}
}
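As a hedged usage sketch (the node name and file path are hypothetical), a caller sets clc_snippets inside the module block that instantiates this cluster, keyed by machine name:
# Hypothetical argument in the cluster's module block; the key must match an
# entry in controller_names or worker_names.
clc_snippets = {
  node2 = ["${file("./snippets/node2-storage.yaml")}"]
}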
# configuration
@@ -91,7 +103,7 @@ variable "pod_cidr" {
variable "service_cidr" {
description = <<EOD
CIDR IPv4 range to assign Kubernetes services.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for kube-dns.
The 1st IP will be reserved for kube_apiserver, the 10th IP will be reserved for coredns.
EOD
type = "string"
@@ -101,7 +113,7 @@ EOD
# optional
variable "cluster_domain_suffix" {
description = "Queries for domains with the suffix will be answered by kube-dns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
description = "Queries for domains with the suffix will be answered by coredns. Default is cluster.local (e.g. foo.default.svc.cluster.local) "
type = "string"
default = "cluster.local"
}
@@ -130,16 +142,8 @@ variable "kernel_args" {
default = []
}
# unofficial, undocumented, unsupported, temporary
variable "controller_networkds" {
type = "list"
description = "Controller Container Linux config networkd section"
default = []
}
variable "worker_networkds" {
type = "list"
description = "Worker Container Linux config networkd section"
default = []
variable "enable_reporting" {
type = "string"
description = "Enable usage or analytics reporting to upstreams (Calico)"
default = "false"
}


@@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster
## Features <a href="https://www.cncf.io/certification/software-conformance/"><img align="right" src="https://storage.googleapis.com/poseidon/certified-kubernetes.png"></a>
* Kubernetes v1.10.3 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, workloads isolated on workers, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* Kubernetes v1.13.2 (upstream, via [kubernetes-incubator/bootkube](https://github.com/kubernetes-incubator/bootkube))
* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking
* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/)
* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/)

Some files were not shown because too many files have changed in this diff.