typhoon

mirror of https://github.com/puppetmaster/typhoon.git synced 2025-10-04 04:24:37 +02:00

Author	SHA1	Message	Date
Dalton Hubble	d55bfd5589	Fix CoreDNS AntiAffinity spec to prefer spreading replicas * Pods were still being scheduled at random due to a typo	2018-10-17 22:19:57 -07:00
Dalton Hubble	9b6113a058	Update Kubernetes from v1.11.3 to v1.12.1 * Mount an empty dir for the controller-manager to work around https://github.com/kubernetes/kubernetes/issues/68973 * Update coreos/pod-checkpointer to strip affinity from checkpointed pod manifests. Kubernetes v1.12.0-rc.1 introduced a default affinity that appears on checkpointed manifests; but it prevented scheduling and checkpointed pods should not have an affinity, they're run directly by the Kubelet on the local node * https://github.com/kubernetes-incubator/bootkube/issues/1001 * https://github.com/kubernetes/kubernetes/pull/68173	2018-10-16 20:28:13 -07:00
Dalton Hubble	5eb4078d68	Add docker/default seccomp to control plane and addons * Annotate pods, deployments, and daemonsets to start containers with the Docker runtime's default seccomp profile * Overrides Kubernetes default behavior which started containers with seccomp=unconfined * https://docs.docker.com/engine/security/seccomp/#pass-a-profile-for-a-container	2018-10-16 20:07:29 -07:00
Dalton Hubble	55bb4dfba6	Raise CoreDNS replica count to 2 or more * Run at least two replicas of CoreDNS to better support rolling updates (previously, kube-dns had a pod nanny) * On multi-master clusters, set the CoreDNS replica count to match the number of masters (e.g. a 3-master cluster previously used replicas:1, now replicas:3) * Add AntiAffinity preferred rule to favor distributing CoreDNS pods across controller nodes	2018-10-13 20:31:29 -07:00
Dalton Hubble	43fe78a2cc	Raise scheduler/controller-manager replicas in multi-master * Continue to ensure scheduler and controller-manager run at least two replicas to support performing kubectl edits on single-master clusters (no change) * For multi-master clusters, set scheduler / controller-manager replica count to the number of masters (e.g. a 3-master cluster previously used replicas:2, now replicas:3)	2018-10-13 16:16:29 -07:00
Dalton Hubble	5a283b6443	Update etcd from v3.3.9 to v3.3.10 * https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.3.md#v3310-2018-10-10	2018-10-13 13:14:37 -07:00
Dalton Hubble	7653e511be	Update CoreDNS and Calico versions * Update CoreDNS from 1.1.3 to 1.2.2 * Update Calico from v3.2.1 to v3.2.3	2018-10-02 16:07:48 +02:00
Dalton Hubble	ad871dbfa9	Update Kubernetes from v1.11.2 to v1.11.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1113	2018-09-13 18:50:41 -07:00
Dalton Hubble	7eb09237f4	Update Calico from v3.1.3 to v3.2.1 * Add new bird and felix readiness checks * Read MTU from ConfigMap veth_mtu * Add RBAC read for serviceaccounts * Remove invalid description from CRDs	2018-08-25 17:53:11 -07:00
Dalton Hubble	bdf1e6986e	Fix terraform fmt	2018-08-21 21:59:55 -07:00
Dalton Hubble	f7ebdf475d	Update Kubernetes from v1.11.1 to v1.11.2 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1112	2018-08-07 21:57:25 -07:00
Dalton Hubble	edc250d62a	Fix Kubelet version for Fedora Atomic modules * Release v1.11.1 erroneously left Fedora Atomic clusters using the v1.11.0 Kubelet. The rest of the control plane ran v1.11.1 as expected * Update Kubelet from v1.11.0 to v1.11.1 so Fedora Atomic matches Container Linux * Container Linux modules were not affected	2018-07-29 12:13:29 -07:00
Dalton Hubble	db64ce3312	Update etcd from v3.3.8 to v3.3.9 * https://github.com/coreos/etcd/blob/master/CHANGELOG-3.3.md#v339-2018-07-24	2018-07-29 11:27:37 -07:00
Dalton Hubble	7c327b8bf4	Update from bootkube v0.12.0 to v0.13.0	2018-07-29 11:20:17 -07:00
Dalton Hubble	13beb13aab	Add descriptions to bare-metal fedora-atomic variables	2018-07-29 11:07:48 -07:00
Dalton Hubble	d8d524d10b	Update Kubernetes from v1.11.0 to v1.11.1 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1111	2018-07-20 00:41:27 -07:00
Dalton Hubble	915f89d3c8	Update Fedora Atomic from 27 to 28 on bare-metal	2018-07-04 11:41:54 -07:00
Dalton Hubble	6f958d7577	Replace kube-dns with CoreDNS * Add system:coredns ClusterRole and binding * Annotate CoreDNS for Prometheus metrics scraping * Remove kube-dns deployment, service, & service account * https://github.com/poseidon/terraform-render-bootkube/pull/71 * https://kubernetes.io/blog/2018/06/27/kubernetes-1.11-release-announcement/	2018-07-01 22:55:01 -07:00
Dalton Hubble	def445a344	Update Fedora Atomic kubelet from v1.10.5 to v1.11.0	2018-06-30 16:45:42 -07:00
Dalton Hubble	8464b258d8	Update Kubernetes from v1.10.5 to v1.11.0 * Force apiserver to stop listening on 127.0.0.1:8080 * Remove deprecated Kubelet `--allow-privileged`. Defaults to true. Use `PodSecurityPolicy` if limiting is desired * https://github.com/kubernetes/kubernetes/releases/tag/v1.11.0 * https://github.com/poseidon/terraform-render-bootkube/pull/68	2018-06-27 22:47:35 -07:00
Dalton Hubble	f4d3059b00	Update Kubernetes from v1.10.4 to v1.10.5 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1105	2018-06-21 22:51:39 -07:00
Dalton Hubble	6c5a1964aa	Change kube-apiserver port from 443 to 6443 * Adjust firewall rules, security groups, cloud load balancers, and generated kubeconfigs * Facilitates some future simplifications and cost reductions * Bare-Metal users who exposed kube-apiserver on a WAN via their router or load balancer will need to adjust its configuration. This is uncommon; most apiservers are on a LAN and/or behind a VPN, so no routing infrastructure is configured with the port number	2018-06-19 23:48:51 -07:00
Dalton Hubble	6e64634748	Update etcd from v3.3.7 to v3.3.8 * https://github.com/coreos/etcd/releases/tag/v3.3.8	2018-06-19 21:56:21 -07:00
Dalton Hubble	ed0b781296	Fix possible deadlock for provisioning bare-metal clusters * Closes #235	2018-06-14 23:15:28 -07:00
Dalton Hubble	51906bf398	Update etcd from v3.3.6 to v3.3.7	2018-06-14 22:46:16 -07:00
Dalton Hubble	79260c48f6	Update Kubernetes from v1.10.3 to v1.10.4	2018-06-06 23:23:11 -07:00
Dalton Hubble	589c3569b7	Update etcd from v3.3.5 to v3.3.6 * https://github.com/coreos/etcd/releases/tag/v3.3.6	2018-06-06 23:19:30 -07:00
Dalton Hubble	6e968cd152	Update Calico from v3.1.2 to v3.1.3 * https://github.com/projectcalico/calico/releases/tag/v3.1.3 * https://github.com/projectcalico/cni-plugin/releases/tag/v3.1.3	2018-05-30 21:32:12 -07:00
Dalton Hubble	4ea1fde9c5	Update Kubernetes from v1.10.2 to v1.10.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1103 * Update Calico from v3.1.1 to v3.1.2	2018-05-21 21:38:43 -07:00
William Zhang	2ae126bf68	Fix README link to tutorial	2018-05-19 13:10:22 -07:00
Dalton Hubble	37981f9fb1	Allow bearer token authn/authz to the Kubelet * Require Webhook authorization to the Kubelet * Switch apiserver X509 client cert org to system:masters to grant the apiserver admin and satisfy the authorization requirement. kubectl commands like logs or exec that have the apiserver make requests of a kubelet continue to work as before * https://kubernetes.io/docs/admin/kubelet-authentication-authorization/ * https://github.com/poseidon/typhoon/issues/215	2018-05-13 23:20:42 -07:00
Dalton Hubble	f2ee75ac98	Require Terraform v0.11.x, drop v0.10.x support * Raise minimum Terraform version to v0.11.0 * Terraform v0.11.x has been supported since Typhoon v1.9.2 and Terraform v0.10.x was last released in Nov 2017. I'd like to stop worrying about v0.10.x and remove migration docs as a later followup * Migration docs docs/topics/maintenance.md#terraform-v011x	2018-05-10 02:20:46 -07:00
Dalton Hubble	8b8e364915	Update etcd from v3.3.4 to v3.3.5 * https://github.com/coreos/etcd/releases/tag/v3.3.5	2018-05-10 02:12:53 -07:00
Dalton Hubble	9d4cbb38f6	Rerun terraform fmt	2018-05-01 21:41:22 -07:00
Dalton Hubble	e889430926	Update kube-dns from v1.14.9 to v1.14.10 * https://github.com/kubernetes/kubernetes/pull/62676	2018-04-28 00:43:09 -07:00
Dalton Hubble	32ddfa94e1	Update Kubernetes from v1.10.1 to v1.10.2 * https://github.com/kubernetes/kubernetes/releases/tag/v1.10.2	2018-04-28 00:27:00 -07:00
Dalton Hubble	681450aa0d	Update etcd from v3.3.3 to v3.3.4 * https://github.com/coreos/etcd/releases/tag/v3.3.4	2018-04-27 23:57:26 -07:00
Dalton Hubble	567e18f015	Fix conflict between Calico and NetworkManager * Observed frequent kube-scheduler and controller-manager restarts with Calico as the CNI provider. Root cause was unclear since control plane was functional and tests of pod to pod network connectivity passed * Root cause: Calico sets up cali* and tunl* network interfaces for containers on hosts. NetworkManager tries to manage these interfaces. It periodically disconnected veth pairs. Logs did not surface this issue since it's not an error per se, just Calico and NetworkManager dueling for control. Kubernetes correctly restarted pods failing health checks and ensured 2 replicas were running so the control plane functioned mostly normally. Pod to pod connectivity was only affected occasionally. Pain to debug. * Solution: Configure NetworkManager to ignore the Calico ifaces per Calico's recommendation. Cloud-init writes files after NetworkManager starts, so a restart is required on first boot. On subsequent boots, the file is present so no restart is needed	2018-04-25 21:45:58 -07:00
Dalton Hubble	0a7fab56e2	Load ip_vs kernel module on boot as workaround * (containerized) kube-proxy warns that it is unable to load the ip_vs kernel module despite having the correct mounts. Atomic uses an xz compressed module and modprobe in the container was not compiled with compression support * Workaround issue for now by always loading ip_vs on-host * https://github.com/kubernetes/kubernetes/issues/60	2018-04-25 21:45:58 -07:00
Dalton Hubble	d784b0fca6	Switch to quay.io/poseidon tagged system containers	2018-04-25 18:15:18 -07:00
Dalton Hubble	7198b9016c	Update Calico from v3.0.4 to v3.1.1 for Atomic	2018-04-21 18:46:56 -07:00
Dalton Hubble	f36c890234	Fix ostree repo to be called fedora-atomic on bare-metal * atomic host updates were fetching updates from the repo cache fedora-atomic-27, instead of from upstream	2018-04-21 18:46:56 -07:00
Dalton Hubble	3f2978821b	Add atomic_assets_endpoint var for fedora-atomic bare-metal	2018-04-21 18:46:56 -07:00
Dalton Hubble	9b88d4bbfd	Use bootkube system container on fedora-atomic * Use the upstream bootkube image packaged with the required metadata to be usable as a system container under systemd * Run bootkube with runc so no host level components use Docker any more. Docker is still the runtime * Remove bootkube script and old systemd unit	2018-04-21 18:46:56 -07:00
Dalton Hubble	3dde4ba8ba	Mount host's /etc/os-release in kubelet system containers * Fix `kubectl describe node` to reflect the host's operating system	2018-04-21 18:46:56 -07:00
Dalton Hubble	e148552220	Enable kubelet allocatable enforcement and QoS cgroup hierarchy * Change kubelet system image to use --cgroups-per-qos=true (default) instead of false * Change kubelet system image to use --enforce-node-allocatable=pods instead of an empty string	2018-04-21 18:46:56 -07:00
Dalton Hubble	d8d1468f03	Update kubelet system container image to mount /etc/hosts * Fix kubelet port-forward on Google Cloud / Fedora Atomic * Mount the host's /etc/hosts in kubelet system containers * Problem: kubelet runc system containers on Atomic were not mounting the host's /etc/hosts, like rkt-fly does on Container Linux. `kubectl port-forward` calls socat with localhost. DNS servers on AWS, DO, and in many bare-metal environments resolve localhost to the caller as a convenience. Google Cloud notably does not, nor is it required to, and this surfaced the missing /etc/hosts in runc kubelet namespaces	2018-04-21 18:46:56 -07:00
Dalton Hubble	cf22e70b46	Name ostree remote repo fedora-atomic across platforms	2018-04-21 18:46:56 -07:00
Dalton Hubble	b3cf9508b6	Update Fedora Atomic modules to Kubernetes v1.10.1	2018-04-21 18:46:56 -07:00
Dalton Hubble	f990473cde	Update control plane manifests and add etcd metrics * Enable etcd v3.3 metrics to expose metrics for scraping by Prometheus * Use k8s.gcr.io instead of gcr.io/google_containers * Add flexvolume plugin mount to controller manager * Update kube-dns from v1.14.8 to v1.14.9	2018-04-21 18:46:56 -07:00
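
The replica-count rule described in commits 55bb4dfba6 (CoreDNS) and 43fe78a2cc (scheduler / controller-manager) can be summarized in a minimal sketch. This is an illustration only: the function name `control_plane_replicas` is hypothetical, and the commits themselves do not show how the rule is implemented.

```python
def control_plane_replicas(master_count: int) -> int:
    """Return the replica count for a control plane component.

    Hypothetical helper illustrating the rule from the commit messages:
    run at least two replicas, and on multi-master clusters match the
    number of masters.
    """
    if master_count < 1:
        raise ValueError("a cluster needs at least one master")
    # Single-master clusters keep 2 replicas; larger clusters scale with masters.
    return max(2, master_count)


if __name__ == "__main__":
    for masters in (1, 2, 3, 5):
        print(f"{masters} master(s): replicas = {control_plane_replicas(masters)}")
```

Under this rule a 3-master cluster gets replicas:3, matching the examples in both commit messages, while a single-master cluster stays at two replicas.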


54 Commits