typhoon

mirror of https://github.com/puppetmaster/typhoon.git synced 2025-10-04 12:34:37 +02:00

Author	SHA1	Message	Date
Dalton Hubble	43fe78a2cc	Raise scheduler/controller-manager replicas in multi-master * Continue to ensure scheduler and controller-manager run at least two replicas to support performing kubectl edits on single-master clusters (no change) * For multi-master clusters, set scheduler / controller-manager replica count to the number of masters (e.g. a 3-master cluster previously used replicas:2, now replicas:3)	2018-10-13 16:16:29 -07:00
Dalton Hubble	5a283b6443	Update etcd from v3.3.9 to v3.3.10 * https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.3.md#v3310-2018-10-10	2018-10-13 13:14:37 -07:00
Dalton Hubble	7653e511be	Update CoreDNS and Calico versions * Update CoreDNS from 1.1.3 to 1.2.2 * Update Calico from v3.2.1 to v3.2.3	2018-10-02 16:07:48 +02:00
Dalton Hubble	ad871dbfa9	Update Kubernetes from v1.11.2 to v1.11.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1113	2018-09-13 18:50:41 -07:00
Dalton Hubble	7eb09237f4	Update Calico from v3.1.3 to v3.2.1 * Add new bird and felix readiness checks * Read MTU from ConfigMap veth_mtu * Add RBAC read for serviceaccounts * Remove invalid description from CRDs	2018-08-25 17:53:11 -07:00
Dalton Hubble	e58b424882	Fix firewall to allow etcd client traffic between controllers * Broaden internal-etcd firewall rule to allow etcd client traffic (2379) from other controller nodes * Previously, kube-apiservers were only able to connect to their node's local etcd peer. While master node outages were tolerated, reaching a healthy peer took longer than necessary in some cases * Reduce time needed to bootstrap a cluster	2018-08-21 23:51:40 -07:00
Dalton Hubble	b8eeafe4f9	Template etcd_servers list to replace null_resource.repeat * Remove the last usage of null_resource.repeat, which has always been an eyesore for creating the etcd server list * Originally, #224 switched to templating the etcd_servers list for all clouds, but had to revert on GCP in #237 * https://github.com/poseidon/typhoon/pull/224 * https://github.com/poseidon/typhoon/pull/237	2018-08-21 22:46:24 -07:00
Dalton Hubble	bdf1e6986e	Fix terraform fmt	2018-08-21 21:59:55 -07:00
Dalton Hubble	da5d2c5321	Remove GCP firewall rule allowing Nginx Ingress health * Nginx Ingress addon no longer uses hostNetwork so Prometheus may scrape port 10254 via the CNI network, rather than via the host address	2018-08-21 21:06:03 -07:00
Dalton Hubble	bceec9fdf5	Sort firewall / security rules and add comments * No functional changes to network firewalls	2018-08-21 20:53:16 -07:00
Dalton Hubble	f7ebdf475d	Update Kubernetes from v1.11.1 to v1.11.2 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1112	2018-08-07 21:57:25 -07:00
Dalton Hubble	edc250d62a	Fix Kubelet version for Fedora Atomic modules * Release v1.11.1 erroneously left Fedora Atomic clusters using the v1.11.0 Kubelet. The rest of the control plane ran v1.11.1 as expected * Update Kubelet from v1.11.0 to v1.11.1 so Fedora Atomic matches Container Linux * Container Linux modules were not affected	2018-07-29 12:13:29 -07:00
Dalton Hubble	db64ce3312	Update etcd from v3.3.8 to v3.3.9 * https://github.com/coreos/etcd/blob/master/CHANGELOG-3.3.md#v339-2018-07-24	2018-07-29 11:27:37 -07:00
Dalton Hubble	7c327b8bf4	Update from bootkube v0.12.0 to v0.13.0	2018-07-29 11:20:17 -07:00
Dalton Hubble	d8d524d10b	Update Kubernetes from v1.11.0 to v1.11.1 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1111	2018-07-20 00:41:27 -07:00
Dalton Hubble	6f958d7577	Replace kube-dns with CoreDNS * Add system:coredns ClusterRole and binding * Annotate CoreDNS for Prometheus metrics scraping * Remove kube-dns deployment, service, & service account * https://github.com/poseidon/terraform-render-bootkube/pull/71 * https://kubernetes.io/blog/2018/06/27/kubernetes-1.11-release-announcement/	2018-07-01 22:55:01 -07:00
Dalton Hubble	fd1de27aef	Remove deprecated ingress_static_ip and controllers_ipv4_public outputs	2018-07-01 20:47:46 -07:00
Dalton Hubble	def445a344	Update Fedora Atomic kubelet from v1.10.5 to v1.11.0	2018-06-30 16:45:42 -07:00
Dalton Hubble	8464b258d8	Update Kubernetes from v1.10.5 to v1.11.0 * Force apiserver to stop listening on 127.0.0.1:8080 * Remove deprecated Kubelet `--allow-privileged`. Defaults to true. Use `PodSecurityPolicy` if limiting is desired * https://github.com/kubernetes/kubernetes/releases/tag/v1.11.0 * https://github.com/poseidon/terraform-render-bootkube/pull/68	2018-06-27 22:47:35 -07:00
Dalton Hubble	0c4d59db87	Use global HTTP/TCP proxy load balancing for Ingress on GCP * Switch Ingress from regional network load balancers to global HTTP/TCP Proxy load balancing * Reduce cost by ~$19/month per cluster. Google bills the first 5 global and regional forwarding rules separately. Typhoon clusters now use 3 global and 0 regional forwarding rules. * Worker pools no longer include an extraneous load balancer. Remove worker module's `ingress_static_ip` output. * Add `ingress_static_ipv4` output variable * Add `worker_instance_group` output to allow custom global load balancing * Deprecate `controllers_ipv4_public` module output * Deprecate `ingress_static_ip` module output. Use `ingress_static_ipv4`	2018-06-23 14:37:40 -07:00
Dalton Hubble	0227014fa0	Fix terraform formatting	2018-06-22 00:28:36 -07:00
Dalton Hubble	f4d3059b00	Update Kubernetes from v1.10.4 to v1.10.5 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1105	2018-06-21 22:51:39 -07:00
Dalton Hubble	6c5a1964aa	Change kube-apiserver port from 443 to 6443 * Adjust firewall rules, security groups, cloud load balancers, and generated kubeconfigs * Facilitates some future simplifications and cost reductions * Bare-Metal users who exposed kube-apiserver on a WAN via their router or load balancer will need to adjust its configuration. This is uncommon; most apiservers are on a LAN and/or behind a VPN, so no routing infrastructure is configured with the port number	2018-06-19 23:48:51 -07:00
Dalton Hubble	6e64634748	Update etcd from v3.3.7 to v3.3.8 * https://github.com/coreos/etcd/releases/tag/v3.3.8	2018-06-19 21:56:21 -07:00
Dalton Hubble	51906bf398	Update etcd from v3.3.6 to v3.3.7	2018-06-14 22:46:16 -07:00
Dalton Hubble	6676484490	Partially revert b7ed6e7bd35cee39a3f65b47e731938c3006b5cd * Fix change that broke Google Cloud container-linux and fedora-atomic https://github.com/poseidon/typhoon/pull/224	2018-06-06 23:48:37 -07:00
Dalton Hubble	79260c48f6	Update Kubernetes from v1.10.3 to v1.10.4	2018-06-06 23:23:11 -07:00
Dalton Hubble	589c3569b7	Update etcd from v3.3.5 to v3.3.6 * https://github.com/coreos/etcd/releases/tag/v3.3.6	2018-06-06 23:19:30 -07:00
Dalton Hubble	6e968cd152	Update Calico from v3.1.2 to v3.1.3 * https://github.com/projectcalico/calico/releases/tag/v3.1.3 * https://github.com/projectcalico/cni-plugin/releases/tag/v3.1.3	2018-05-30 21:32:12 -07:00
Ben Drucker	6a581ab577	Render etcd_initial_cluster using a template_file	2018-05-30 21:14:49 -07:00
Dalton Hubble	4ea1fde9c5	Update Kubernetes from v1.10.2 to v1.10.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1103 * Update Calico from v3.1.1 to v3.1.2	2018-05-21 21:38:43 -07:00
Dalton Hubble	28d0891729	Annotate nginx-ingress addon for Prometheus auto-discovery * Add Google Cloud firewall rule to allow worker to worker access to health and metrics	2018-05-19 13:13:14 -07:00
William Zhang	2ae126bf68	Fix README link to tutorial	2018-05-19 13:10:22 -07:00
Dalton Hubble	c2b719dc75	Configure Prometheus to scrape Kubelets directly * Use Kubelet bearer token authn/authz to scrape metrics * Drop RBAC permission from nodes/proxy to nodes/metrics * Stop proxying kubelet scrapes through the apiserver, since this required higher privilege (nodes/proxy) and can add load to the apiserver on large clusters	2018-05-14 23:06:50 -07:00
Dalton Hubble	37981f9fb1	Allow bearer token authn/authz to the Kubelet * Require Webhook authorization to the Kubelet * Switch apiserver X509 client cert org to system:masters to grant the apiserver admin and satisfy the authorization requirement. kubectl commands like logs or exec that have the apiserver make requests of a kubelet continue to work as before * https://kubernetes.io/docs/admin/kubelet-authentication-authorization/ * https://github.com/poseidon/typhoon/issues/215	2018-05-13 23:20:42 -07:00
Dalton Hubble	f2ee75ac98	Require Terraform v0.11.x, drop v0.10.x support * Raise minimum Terraform version to v0.11.0 * Terraform v0.11.x has been supported since Typhoon v1.9.2 and Terraform v0.10.x was last released in Nov 2017. I'd like to stop worrying about v0.10.x and remove migration docs as a later followup * Migration docs docs/topics/maintenance.md#terraform-v011x	2018-05-10 02:20:46 -07:00
Dalton Hubble	8b8e364915	Update etcd from v3.3.4 to v3.3.5 * https://github.com/coreos/etcd/releases/tag/v3.3.5	2018-05-10 02:12:53 -07:00
Dalton Hubble	9d4cbb38f6	Rerun terraform fmt	2018-05-01 21:41:22 -07:00
Dalton Hubble	e889430926	Update kube-dns from v1.14.9 to v1.14.10 * https://github.com/kubernetes/kubernetes/pull/62676	2018-04-28 00:43:09 -07:00
Dalton Hubble	32ddfa94e1	Update Kubernetes from v1.10.1 to v1.10.2 * https://github.com/kubernetes/kubernetes/releases/tag/v1.10.2	2018-04-28 00:27:00 -07:00
Dalton Hubble	681450aa0d	Update etcd from v3.3.3 to v3.3.4 * https://github.com/coreos/etcd/releases/tag/v3.3.4	2018-04-27 23:57:26 -07:00
Dalton Hubble	567e18f015	Fix conflict between Calico and NetworkManager * Observed frequent kube-scheduler and controller-manager restarts with Calico as the CNI provider. Root cause was unclear since control plane was functional and tests of pod to pod network connectivity passed * Root cause: Calico sets up cali* and tunl* network interfaces for containers on hosts. NetworkManager tries to manage these interfaces. It periodically disconnected veth pairs. Logs did not surface this issue since it's not an error per se, just Calico and NetworkManager dueling for control. Kubernetes correctly restarted pods failing health checks and ensured 2 replicas were running so the control plane functioned mostly normally. Pod to pod connectivity was only affected occasionally. Pain to debug. * Solution: Configure NetworkManager to ignore the Calico ifaces per Calico's recommendation. Cloud-init writes files after NetworkManager starts, so a restart is required on first boot. On subsequent boots, the file is present so no restart is needed	2018-04-25 21:45:58 -07:00
Dalton Hubble	0a7fab56e2	Load ip_vs kernel module on boot as workaround * (containerized) kube-proxy warns that it is unable to load the ip_vs kernel module despite having the correct mounts. Atomic uses an xz compressed module and modprobe in the container was not compiled with compression support * Workaround issue for now by always loading ip_vs on-host * https://github.com/kubernetes/kubernetes/issues/60	2018-04-25 21:45:58 -07:00
Dalton Hubble	d784b0fca6	Switch to quay.io/poseidon tagged system containers	2018-04-25 18:15:18 -07:00
Dalton Hubble	7198b9016c	Update Calico from v3.0.4 to v3.1.1 for Atomic	2018-04-21 18:46:56 -07:00
Dalton Hubble	9b88d4bbfd	Use bootkube system container on fedora-atomic * Use the upstream bootkube image packaged with the required metadata to be usable as a system container under systemd * Run bootkube with runc so no host level components use Docker any more. Docker is still the runtime * Remove bootkube script and old systemd unit	2018-04-21 18:46:56 -07:00
Dalton Hubble	3dde4ba8ba	Mount host's /etc/os-release in kubelet system containers * Fix `kubectl describe node` to reflect the host's operating system	2018-04-21 18:46:56 -07:00
Dalton Hubble	e148552220	Enable kubelet allocatable enforcement and QoS cgroup hierarchy * Change kubelet system image to use --cgroups-per-qos=true (default) instead of false * Change kubelet system image to use --enforce-node-allocatable=pods instead of an empty string	2018-04-21 18:46:56 -07:00
Dalton Hubble	d8d1468f03	Update kubelet system container image to mount /etc/hosts * Fix kubelet port-forward on Google Cloud / Fedora Atomic * Mount the host's /etc/hosts in kubelet system containers * Problem: kubelet runc system containers on Atomic were not mounting the host's /etc/hosts, like rkt-fly does on Container Linux. `kubectl port-forward` calls socat with localhost. DNS servers on AWS, DO, and in many bare-metal environments resolve localhost to the caller as a convenience. Google Cloud notably does not, nor is it required to, and this surfaced the missing /etc/hosts in runc kubelet namespaces.	2018-04-21 18:46:56 -07:00
Dalton Hubble	2b74aba564	Add Google Cloud fedora-atomic module * Network load balancer for ingress doesn't work yet because Compute Engine packages are missing * port-forward / socat is broken	2018-04-21 18:46:56 -07:00

258 Commits
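
The replica rule described in commit 43fe78a2cc (at least two scheduler / controller-manager replicas on single-master clusters, one per master on multi-master clusters) can be sketched as below. This is a minimal illustration in Python, not the module's actual Terraform code; the function name `control_plane_replicas` and its `master_count` parameter are hypothetical names chosen for this example.

```python
def control_plane_replicas(master_count: int) -> int:
    """Sketch of the replica count for kube-scheduler and kube-controller-manager.

    Hypothetical helper: the name and signature are assumptions made for
    illustration and do not come from the Typhoon source.
    """
    if master_count < 1:
        raise ValueError("a cluster needs at least one master")
    # Single-master clusters keep two replicas so that kubectl edits can roll
    # safely; multi-master clusters run one replica per master.
    return max(2, master_count)


if __name__ == "__main__":
    for masters in (1, 2, 3):
        print(masters, "master(s) ->", control_plane_replicas(masters), "replicas")
```

Under this sketch, a 3-master cluster moves from the earlier fixed value of 2 replicas to 3, while a single-master cluster is unchanged at 2.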