typhoon

mirror of https://github.com/puppetmaster/typhoon.git synced 2025-09-16 21:39:44 +02:00

Author	SHA1	Message	Date
Dalton Hubble	be29f52039	Add enable_aggregation option (defaults to false) * Add an `enable_aggregation` variable to enable the kube-apiserver aggregation layer for adding extension apiservers to clusters * Aggregation is disabled by default. Typhoon recommends you not enable aggregation. Consider whether less invasive ways to achieve your goals are possible and whether those goals are well-founded * Enabling aggregation and extension apiservers increases the attack surface of a cluster and makes extensions a part of the control plane. Admins must scrutinize and trust any extension apiserver used. * Passing a v1.14 CNCF conformance test requires aggregation be enabled. Having an option for aggregation keeps compliance, but retains the stricter security posture on default clusters	2019-04-07 12:00:38 -07:00
Dalton Hubble	5271e410eb	Update Kubernetes from v1.13.5 to v1.14.0 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#v1140	2019-04-07 00:15:59 -07:00
Dalton Hubble	aaa8e0261a	Add Google Cloud worker instances to a target pool * Background: A managed instance group of workers is used in backend services for global load balancing (HTTP/HTTPS Ingress) and output for custom global load balancing use cases * Add worker instances to a target pool load balancing TCP/UDP applications (NodePort or proxied). Output as `worker_target_pool` * Health check for workers with a healthy Ingress controller. Forward rules (regional) to target pools don't support different external and internal ports so choosing nodes with Ingress allows proxying as a workaround * A target pool is a logical grouping only. It doesn't add costs to clusters or worker pools	2019-04-01 21:03:48 -07:00
Dalton Hubble	b3ec5f73e3	Update Calico from v3.6.0 to v3.6.1 * https://docs.projectcalico.org/v3.6/release-notes/	2019-03-31 17:43:43 -07:00
Dalton Hubble	46196af500	Remove Haswell minimum CPU platform requirement * Google Cloud API implements `min_cpu_platform` to mean "use exactly this CPU" * Fix error creating clusters in newer regions lacking Haswell platform (e.g. europe-west2) (#438) * Reverts #405, added in v1.13.4 * Original goal of ignoring old Ivy/Sandy bridge CPUs in older regions will be achieved shortly anyway. Google Cloud is deprecating those CPUs in April 2019 * https://cloud.google.com/compute/docs/instances/specify-min-cpu-platform#how_selecting_a_minimum_cpu_platform_works	2019-03-27 19:51:32 -07:00
Dalton Hubble	4fea526ebf	Update Kubernetes from v1.13.4 to v1.13.5 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1135	2019-03-25 21:43:47 -07:00
Dalton Hubble	1feefbe9c6	Update Calico from v3.5.2 to v3.6.0 * Add calico-ipam CRDs and RBAC permissions * Switch IPAM from host-local to calico-ipam * `calico-ipam` subnets `ippools` (defaults to pod CIDR) into `ipamblocks` (defaults to /26, but set to /24 in Typhoon) * `host-local` subnets the pod CIDR based on the node PodCIDR field (set via kube-controller-manager as /24's) * Create a custom default IPv4 IPPool to ensure the block size is kept at /24 to allow 110 pods per node (Kubernetes default) * Retaining host-local was slightly preferred, but Calico v3.6 is migrating all usage to calico-ipam. The codepath that skipped calico-ipam for KDD was removed * https://docs.projectcalico.org/v3.6/release-notes/	2019-03-19 22:49:56 -07:00
Dalton Hubble	2019177b6b	Fix implicit map assignments to be explicit * Terraform v0.12 will require map assignments be explicit, part of v0.12 readiness	2019-03-12 01:19:54 -07:00
Dalton Hubble	deec512c14	Resolve in-addr.arpa and ip6.arpa zones with CoreDNS kubernetes plugin * Resolve in-addr.arpa and ip6.arpa DNS PTR requests for Kubernetes service IPs and pod IPs * Previously, CoreDNS was configured to resolve in-addr.arpa PTR records for service IPs (but not pod IPs)	2019-03-04 23:03:00 -08:00
Dalton Hubble	f598307998	Update Kubernetes from v1.13.3 to v1.13.4 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1134	2019-02-28 22:47:43 -08:00
Dalton Hubble	73ae5d5649	Update Calico from v3.5.1 to v3.5.2 * https://docs.projectcalico.org/v3.5/releases/	2019-02-25 21:23:13 -08:00
Dalton Hubble	42d7222f3d	Add a readiness probe to CoreDNS * https://github.com/poseidon/terraform-render-bootkube/pull/115	2019-02-23 13:25:23 -08:00
Dalton Hubble	7f8572030d	Upgrade to support terraform-provider-google v2.0+ * Support terraform-provider-google v1.19.0, v1.19.1, v1.20.0 and v2.0+ (and allow for future 2.x.y releases) * Require terraform-provider-google v1.19.0 or newer. v1.19.0 introduced `network_interface` fields `network_ip` and `nat_ip` to deprecate `address` and `assigned_nat_ip`. Those deprecated fields are removed in terraform-provider-google v2.0 * https://github.com/terraform-providers/terraform-provider-google/releases/tag/v2.0.0	2019-02-20 02:33:32 -08:00
Dalton Hubble	4294bd0292	Assign Pod Priority classes to critical cluster and node components * Assign pod priorityClassNames to critical cluster and node components (higher is higher priority) to inform node out-of-resource eviction order and scheduler preemption and scheduling order * Priority Admission Controller has been enabled since Typhoon v1.11.1	2019-02-19 22:21:39 -08:00
Dalton Hubble	ba4c5de052	Set the Google Cloud minimum CPU platform to Intel Haswell * Intel Haswell or better is available in every zone around the world * Neither Kubernetes nor Typhoon has a particular minimum processor family. However, a few Google Cloud zones still default to Sandy/Ivy bridge (scheduled to shift April 2019). Price is only based on machine type so it is beneficial to opt for the next processor family * Intel Haswell is a suitable minimum since it still allows plenty of liberty in choosing any region or machine type * Likely a slight increase to preemption probability in a few zones, but any lower probability on Sandy/Ivy bridge is due to lower desirability as they're phased out * https://cloud.google.com/compute/docs/regions-zones/	2019-02-18 12:55:04 -08:00
Dalton Hubble	584088397c	Update etcd from v3.3.11 to v3.3.12 * https://github.com/etcd-io/etcd/releases/tag/v3.3.12	2019-02-09 11:54:54 -08:00
Dalton Hubble	0200058e0e	Update Calico from v3.5.0 to v3.5.1 * Fix in confd https://github.com/projectcalico/confd/pull/205	2019-02-09 11:49:31 -08:00
Dalton Hubble	ccd96c37da	Update Kubernetes from v1.13.2 to v1.13.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1133	2019-02-01 23:26:13 -08:00
Dalton Hubble	244a1a601a	Switch CoreDNS to use the forward plugin instead of proxy * Use the forward plugin to forward to upstream resolvers, instead of the proxy plugin. The forward plugin is reported to be a faster alternative since it can re-use open sockets * https://coredns.io/explugins/forward/ * https://coredns.io/plugins/proxy/ * https://github.com/kubernetes/kubernetes/issues/73254	2019-01-30 22:25:23 -08:00
Dalton Hubble	1ab06f69d7	Update flannel from v0.10.0 to v0.11.0 * https://github.com/coreos/flannel/releases/tag/v0.11.0	2019-01-29 21:51:25 -08:00
Dalton Hubble	e9659a8539	Update Calico from v3.4.0 to v3.5.0 * https://docs.projectcalico.org/v3.5/releases/	2019-01-27 16:34:30 -08:00
Dalton Hubble	f4d3508578	Update CoreDNS from v1.3.0 to v1.3.1 * https://coredns.io/2019/01/13/coredns-1.3.1-release/	2019-01-15 22:50:25 -08:00
Dalton Hubble	7eafa59d8f	Fix instance shutdown automatic worker deletion on clouds * Fix a regression caused by lowering the Kubelet TLS client certificate to system:nodes group (#100) since dropping cluster-admin dropped the Kubelet's ability to delete nodes. * On clouds where workers can scale down (manual terraform apply, AWS spot termination, Azure low priority deletion), worker shutdown runs the delete-node.service to remove a node to prevent NotReady nodes from accumulating * Allow Kubelets to delete cluster nodes via system:nodes group. Kubelets acting with system:node and kubelet-delete ClusterRoles is still an improvement over acting as cluster-admin	2019-01-14 23:27:48 -08:00
Dalton Hubble	b74cc8afd2	Update etcd from v3.3.10 to v3.3.11 * https://github.com/etcd-io/etcd/releases/tag/v3.3.11	2019-01-12 14:17:25 -08:00
Dalton Hubble	4d32b79c6f	Update Kubernetes from v1.13.1 to v1.13.2 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1132	2019-01-12 00:00:53 -08:00
Dalton Hubble	df4c0ba05d	Use HTTPS liveness probes for kube-scheduler and kube-controller-manager * Disable kube-scheduler and kube-controller-manager HTTP ports	2019-01-09 20:56:50 -08:00
Dalton Hubble	bfe0c74793	Enable the certificates.k8s.io API to issue cluster certificates * System components that require certificates signed by the cluster CA can submit a CSR to the apiserver, have an administrator inspect and approve it, and be issued a certificate * Configure kube-controller-manager to sign Approved CSRs using the cluster CA private key * Admins are responsible for approving or denying CSRs; otherwise, no certificate is issued. Read the Kubernetes docs carefully and verify the entity making the request and the authorization level * https://kubernetes.io/docs/tasks/tls/managing-tls-in-a-cluster	2019-01-06 17:33:37 -08:00
Dalton Hubble	60c70797ec	Use a single format of the admin kubeconfig * Use a single admin kubeconfig for initial bootkube bootstrap and for use by a human admin. Previously, an admin kubeconfig without a named context was used for bootstrap and direct usage with KUBECONFIG=path, while one with a named context was used for `kubectl config use-context` style usage. Confusing. * Provide the admin kubeconfig via `assets/auth/kubeconfig`, `assets/auth/CLUSTER-config`, or output `kubeconfig-admin`	2019-01-05 14:57:18 -08:00
Dalton Hubble	6795a753ea	Update CoreDNS from v1.2.6 to v1.3.0 * https://coredns.io/2018/12/15/coredns-1.3.0-release/	2019-01-05 13:35:03 -08:00
Dalton Hubble	b57273b6f1	Rename internal kube_dns_service_ip to cluster_dns_service_ip * terraform-render-bootkube module deprecated kube_dns_service_ip output in favor of cluster_dns_service_ip * Rename k8s_dns_service_ip to cluster_dns_service_ip for consistency too	2019-01-05 13:32:03 -08:00
Dalton Hubble	812a1adb49	Use a lower-privilege Kubelet kubeconfig in system:nodes * Kubelets can use a lower-privilege TLS client certificate with Org system:nodes and a binding to the system:node ClusterRole * Admin kubeconfigs continue to belong to Org system:masters to provide cluster-admin (available in assets/auth/kubeconfig or as a Terraform output kubeconfig-admin) * Remove bare-metal output variable kubeconfig	2019-01-05 13:08:56 -08:00
Dalton Hubble	66e1365cc4	Add ServiceAccounts for kube-apiserver and kube-scheduler * Add ServiceAccounts and ClusterRoleBindings for kube-apiserver and kube-scheduler * Remove the ClusterRoleBinding for the kube-system default ServiceAccount * Rename the CA certificate CommonName for consistency with upstream	2019-01-01 20:16:14 -08:00
Dalton Hubble	bcb200186d	Add admin kubeconfig as a Terraform output * May be used to write a local file	2018-12-15 22:52:28 -08:00
Dalton Hubble	479d498024	Update Calico from v3.3.2 to v3.4.0 * https://docs.projectcalico.org/v3.4/releases/	2018-12-15 18:05:16 -08:00
Dalton Hubble	e0c032be94	Increase GCP TCP proxy apiserver backend timeout to 5 minutes * On GCP, kubectl port-forward connections to pods are closed after a timeout (unlike AWS NLB's or Azure load balancers) * Increase the GCP apiserver backend service timeout from 1 minute to 5 minutes to be more similar to AWS/Azure LB behavior	2018-12-15 17:34:18 -08:00
Dalton Hubble	018c5edc25	Update Kubernetes from v1.13.0 to v1.13.1 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1131	2018-12-15 11:44:57 -08:00
Dalton Hubble	ff6ab571f3	Update Calico from v3.3.1 to v3.3.2 * https://docs.projectcalico.org/v3.3/releases/	2018-12-06 22:56:55 -08:00
Dalton Hubble	d31f444fcd	Update Kubernetes from v1.12.3 to v1.13.0	2018-12-03 20:44:32 -08:00
Dalton Hubble	76d993cdae	Add experimental kube-router CNI provider * Add kube-router for pod networking and NetworkPolicy as an experiment * Experiments are not documented or supported in any way, and may be removed without notice. They have known issues and aren't enabled without special options.	2018-12-03 19:52:28 -08:00
Dalton Hubble	64b4c10418	Improve features and modules list docs * Remove bullet about isolating workloads on workers, it's now common practice and new users will assume it * List advanced features available in each module * Fix erroneous Kubernetes version listing for Google Cloud Fedora Atomic	2018-11-26 22:58:00 -08:00
Dalton Hubble	5b27d8d889	Update Kubernetes from v1.12.2 to v1.12.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.12.md/#v1123	2018-11-26 21:06:09 -08:00
Dalton Hubble	840b73f9ba	Update pod-checkpointer image to query Kubelet secure API * Updates pod-checkpointer to prefer the Kubelet secure API (before falling back to the Kubelet read-only API that is disabled on Typhoon clusters since https://github.com/poseidon/typhoon/pull/324) * Previously, pod-checkpointer checkpointed an initial set of pods during bootstrapping so recovery from power cycling clusters was unaffected, but logs were noisy * https://github.com/kubernetes-incubator/bootkube/pull/1027 * https://github.com/kubernetes-incubator/bootkube/pull/1025	2018-11-26 20:24:32 -08:00
Dalton Hubble	915af3c6cc	Fix Calico Felix reporting usage data, require opt-in * Calico Felix has been reporting anonymous usage data about the version and cluster size, which violates Typhoon's privacy policy where analytics should be opt-in only * Add a variable enable_reporting (default: false) to allow opting in to reporting usage data to Calico (or future components)	2018-11-20 01:03:00 -08:00
Dalton Hubble	ea3fc6d2a7	Update CoreDNS from v1.2.4 to v1.2.6 * https://coredns.io/2018/11/05/coredns-1.2.6-release/	2018-11-18 16:45:53 -08:00
Dalton Hubble	56e9a82984	Add flannel resource request and mount only /run/flannel	2018-11-11 20:35:21 -08:00
Dalton Hubble	e95b856a22	Enable CoreDNS loop and loadbalance plugins * loop sends an initial query to detect infinite forwarding loops in configured upstream DNS servers and fast exit with an error (it's a fatal misconfiguration on the network that will otherwise cause resolvers to consume memory/CPU until crashing, masking the problem) * https://github.com/coredns/coredns/tree/master/plugin/loop * loadbalance randomizes the ordering of A, AAAA, and MX records in responses to provide round-robin load balancing (as usual, clients may still cache responses though) * https://github.com/coredns/coredns/tree/master/plugin/loadbalance	2018-11-10 17:36:56 -08:00
Dalton Hubble	2b3f61d1bb	Update Calico from v3.3.0 to v3.3.1 * Structure Calico and flannel manifests * Rename kube-flannel mentions to just flannel	2018-11-10 13:37:12 -08:00
Dalton Hubble	8fd2978c31	Update bootkube image version from v0.13.0 to v0.14.0 * https://github.com/kubernetes-incubator/bootkube/releases/tag/v0.14.0	2018-11-06 23:35:11 -08:00
Dalton Hubble	721c847943	Set kube-apiserver kubelet preferred address types * Prefer InternalIP and ExternalIP over the node's hostname, to match upstream behavior and kubeadm * Previously, hostname-override was used to set node names to internal IP's to work around some cloud providers not resolving hostnames for instances (e.g. DO droplets)	2018-11-03 22:31:55 -07:00
Dalton Hubble	0e71f7e565	Ignore controller user_data changes to allow plugin updates * Updating the `terraform-provider-ct` plugin is known to produce a `user_data` diff in all pre-existing clusters. Applying the diff to a pre-existing cluster destroys controller nodes * Ignore changes to controller `user_data`. Once all managed clusters use a release containing this change, it is possible to update the `terraform-provider-ct` plugin (worker `user_data` will still be modified) * Changing the module `ref` for an existing cluster and re-applying is still NOT supported (although this PR would protect controllers from being destroyed)	2018-10-28 16:48:12 -07:00

251 Commits