typhoon

mirror of https://github.com/puppetmaster/typhoon.git synced 2025-10-04 04:24:37 +02:00

Author	SHA1	Message	Date
Dalton Hubble	c2b719dc75	Configure Prometheus to scrape Kubelets directly * Use Kubelet bearer token authn/authz to scrape metrics * Drop RBAC permission from nodes/proxy to nodes/metrics * Stop proxying kubelet scrapes through the apiserver, since this required higher privilege (nodes/proxy) and can add load to the apiserver on large clusters	2018-05-14 23:06:50 -07:00
Dalton Hubble	37981f9fb1	Allow bearer token authn/authz to the Kubelet * Require Webhook authorization to the Kubelet * Switch apiserver X509 client cert org to system:masters to grant the apiserver admin and satisfy the authorization requirement. kubectl commands like logs or exec that have the apiserver make requests of a kubelet continue to work as before * https://kubernetes.io/docs/admin/kubelet-authentication-authorization/ * https://github.com/poseidon/typhoon/issues/215	2018-05-13 23:20:42 -07:00
Dalton Hubble	5eb11f5104	Allow Flatcar Linux os_image on AWS, rename os_channel * Replace os_channel variable with os_image to align naming across clouds. Users who set this option to stable, beta, or alpha should now set os_image to coreos-stable, coreos-beta, or coreos-alpha. * Default os_image to coreos-stable. This continues to use the most recent image from the stable channel as always. * Allow Container Linux derivative Flatcar Linux by setting os_image to `flatcar-stable`, `flatcar-beta`, `flatcar-alpha`	2018-05-12 11:41:58 -07:00
Dalton Hubble	f2ee75ac98	Require Terraform v0.11.x, drop v0.10.x support * Raise minimum Terraform version to v0.11.0 * Terraform v0.11.x has been supported since Typhoon v1.9.2 and Terraform v0.10.x was last released in Nov 2017. I'd like to stop worrying about v0.10.x and remove migration docs as a later followup * Migration docs: docs/topics/maintenance.md#terraform-v011x	2018-05-10 02:20:46 -07:00
Dalton Hubble	8b8e364915	Update etcd from v3.3.4 to v3.3.5 * https://github.com/coreos/etcd/releases/tag/v3.3.5	2018-05-10 02:12:53 -07:00
Michael Holt	a5916da0e2	Update min AWS provider from v1.11 to v1.13	2018-05-02 15:16:03 -07:00
Dalton Hubble	cc29530ba0	Allow preemptible workers on AWS via spot instances * Add `worker_price` to allow worker spot instances. Defaults to empty string for the worker autoscaling group to use regular on-demand instances. * Add `spot_price` to internal `workers` module for spot worker pools * Note: Unlike GCP `preemptible` workers, spot instances require you to pick a bid price.	2018-04-29 13:31:17 -07:00
Dalton Hubble	e889430926	Update kube-dns from v1.14.9 to v1.14.10 * https://github.com/kubernetes/kubernetes/pull/62676	2018-04-28 00:43:09 -07:00
Dalton Hubble	32ddfa94e1	Update Kubernetes from v1.10.1 to v1.10.2 * https://github.com/kubernetes/kubernetes/releases/tag/v1.10.2	2018-04-28 00:27:00 -07:00
Dalton Hubble	681450aa0d	Update etcd from v3.3.3 to v3.3.4 * https://github.com/coreos/etcd/releases/tag/v3.3.4	2018-04-27 23:57:26 -07:00
Dalton Hubble	567e18f015	Fix conflict between Calico and NetworkManager * Observed frequent kube-scheduler and controller-manager restarts with Calico as the CNI provider. Root cause was unclear since control plane was functional and tests of pod to pod network connectivity passed * Root cause: Calico sets up cali* and tunl* network interfaces for containers on hosts. NetworkManager tries to manage these interfaces. It periodically disconnected veth pairs. Logs did not surface this issue since it's not an error per se, just Calico and NetworkManager dueling for control. Kubernetes correctly restarted pods failing health checks and ensured 2 replicas were running so the control plane functioned mostly normally. Pod to pod connectivity was only affected occasionally. Pain to debug. * Solution: Configure NetworkManager to ignore the Calico ifaces per Calico's recommendation. Cloud-init writes files after NetworkManager starts, so a restart is required on first boot. On subsequent boots, the file is present so no restart is needed	2018-04-25 21:45:58 -07:00
Dalton Hubble	0a7fab56e2	Load ip_vs kernel module on boot as workaround * (containerized) kube-proxy warns that it is unable to load the ip_vs kernel module despite having the correct mounts. Atomic uses an xz compressed module and modprobe in the container was not compiled with compression support * Workaround issue for now by always loading ip_vs on-host * https://github.com/kubernetes/kubernetes/issues/60	2018-04-25 21:45:58 -07:00
Dalton Hubble	d784b0fca6	Switch to quay.io/poseidon tagged system containers	2018-04-25 18:15:18 -07:00
Dalton Hubble	7198b9016c	Update Calico from v3.0.4 to v3.1.1 for Atomic	2018-04-21 18:46:56 -07:00
Dalton Hubble	233ec6dcb0	Update Fedora Atomic AMI to version 27.122 * http://www.projectatomic.io/blog/2018/04/fedora-atomic-20-apr-18/ * Atomic publishes nightly AMIs which sometimes don't boot or have issues. Until there is a source of reliable AMIs, pin the best known working AMI * Rel 66a66f0d18544591ffdbf8fae9df790113c93d72	2018-04-21 18:46:56 -07:00
Dalton Hubble	9b88d4bbfd	Use bootkube system container on fedora-atomic * Use the upstream bootkube image packaged with the required metadata to be usable as a system container under systemd * Run bootkube with runc so no host level components use Docker any more. Docker is still the runtime * Remove bootkube script and old systemd unit	2018-04-21 18:46:56 -07:00
Dalton Hubble	3dde4ba8ba	Mount host's /etc/os-release in kubelet system containers * Fix `kubectl describe node` to reflect the host's operating system	2018-04-21 18:46:56 -07:00
Dalton Hubble	e148552220	Enable kubelet allocatable enforcement and QoS cgroup hierarchy * Change kubelet system image to use --cgroups-per-qos=true (default) instead of false * Change kubelet system image to use --enforce-node-allocatable=pods instead of an empty string	2018-04-21 18:46:56 -07:00
Dalton Hubble	d8d1468f03	Update kubelet system container image to mount /etc/hosts * Fix kubelet port-forward on Google Cloud / Fedora Atomic * Mount the host's /etc/hosts in kubelet system containers * Problem: kubelet runc system containers on Atomic were not mounting the host's /etc/hosts, like rkt-fly does on Container Linux. `kubectl port-forward` calls socat with localhost. DNS servers on AWS, DO, and in many bare-metal environments resolve localhost to the caller as a convenience. Google Cloud notably does not nor is it required to do so and this surfaced the missing /etc/hosts in runc kubelet namespaces.	2018-04-21 18:46:56 -07:00
Dalton Hubble	24d230505a	Add cloud-metadata.service on AWS fedora-atomic	2018-04-21 18:46:56 -07:00
Dalton Hubble	b3cf9508b6	Update Fedora Atomic modules to Kubernetes v1.10.1	2018-04-21 18:46:56 -07:00
Dalton Hubble	5212684472	Temporarily pin Fedora Atomic AMI * Atomic has published AMI images that shut down immediately after being powered on	2018-04-21 18:46:56 -07:00
Dalton Hubble	f990473cde	Update control plane manifests and add etcd metrics * Enable etcd v3.3 metrics to expose metrics for scraping by Prometheus * Use k8s.gcr.io instead of gcr.io/google_containers * Add flexvolume plugin mount to controller manager * Update kube-dns from v1.14.8 to v1.14.9	2018-04-21 18:46:56 -07:00
Dalton Hubble	8523a086e2	Fix kubelet system container to mount CNI plugins * Mount /opt/cni/bin in kubelet system container so CNI plugin binaries can be found. Before, flannel worked because the kubelet falls back to flannel plugin baked into the hyperkube (undesired) * Move the CNI bin install location later, since /opt changes may be lost between ostree rebases	2018-04-21 18:46:56 -07:00
Dalton Hubble	19bc5aea9e	Use kubelet system container on fedora-atomic * Use the upstream hyperkube image packaged with the required metadata to be usable as a system container under systemd * Fix port-forward since socat is included	2018-04-21 18:46:56 -07:00
Dalton Hubble	8d7cfc1a45	Use etcd system container on fedora-atomic * Use the upstream etcd image packaged with the required metadata to be usable as a system container (runc) under systemd	2018-04-21 18:46:56 -07:00
Dalton Hubble	9969c357da	Change AWS Fedora module to fedora-atomic	2018-04-21 18:46:56 -07:00
Dalton Hubble	b80a2eb8a0	Sync fedora-cloud modules with Container Linux * Update manifests for Kubernetes v1.10.0 * Update etcd from v3.3.2 to v3.3.3 * Add disk_type optional variable on AWS * Remove redundant kubeconfig copy on AWS * Distribute etcd secrets only to controllers * Organize module variables and ssh steps	2018-04-21 18:46:56 -07:00
Dalton Hubble	3610da8b71	Add fedora-cloud module for AWS	2018-04-21 18:46:56 -07:00
Dalton Hubble	a54f76db2a	Update Calico from v3.0.4 to v3.1.1 * https://github.com/projectcalico/calico/releases/tag/v3.1.1 * https://github.com/projectcalico/calico/releases/tag/v3.1.0	2018-04-21 18:30:36 -07:00
Dalton Hubble	23a8156bdf	Fix a few typos in comments	2018-04-15 17:21:49 -07:00
Dalton Hubble	77c0a4cf2e	Update Kubernetes from v1.10.0 to v1.10.1 * Use kubernetes-incubator/bootkube v0.12.0	2018-04-12 20:57:31 -07:00
Dalton Hubble	9bb3de5327	Skip creating unused dirs on worker nodes	2018-04-11 22:23:51 -07:00
Dalton Hubble	6b08bde479	Use k8s.gcr.io instead of gcr.io/google_containers * Kubernetes recommends using the alias to fetch images from the nearest GCR regional mirror, to abstract the use of GCR, and to drop names containing 'google' * https://groups.google.com/forum/#!msg/kubernetes-dev/ytjk_rNrTa0/3EFUHvovCAAJ	2018-04-08 12:57:52 -07:00
Dalton Hubble	f4b2396718	Return Prometheus deployment to be a worker workload * Expose etcd metrics to workers so Prometheus can run on a worker, rather than a controller * Drop temporary firewall rules allowing Prometheus to run on a controller and scrape targets * Related to https://github.com/poseidon/typhoon/pull/175	2018-04-08 12:20:00 -07:00
Dalton Hubble	18dbaf74ce	Update kube-dns from v1.14.8 to v1.14.9 * https://github.com/kubernetes/kubernetes/pull/61908	2018-04-04 21:00:23 -07:00
Dalton Hubble	ce001e9d56	Update etcd from v3.3.2 to v3.3.3 * https://github.com/coreos/etcd/releases/tag/v3.3.3	2018-04-04 20:32:24 -07:00
Dalton Hubble	d770393dbc	Add etcd metrics, Prometheus scrapes, and Grafana dash * Use etcd v3.3 --listen-metrics-urls to expose only metrics data via http://0.0.0.0:2381 on controllers * Add Prometheus discovery for etcd peers on controller nodes * Temporarily drop two noisy Prometheus alerts	2018-04-03 20:31:00 -07:00
Dalton Hubble	1cc043d1eb	Update Kubernetes from v1.9.6 to v1.10.0	2018-03-30 22:14:07 -07:00
Dalton Hubble	f8e9bfb1c0	Add disk_type variable for EBS volume type on AWS * Change EBS volume type from `standard` ("prior generation") to `gp2`. Prometheus alerts are tuned for SSDs * Other platforms have fast enough disks by default	2018-03-29 22:51:54 -07:00
Dalton Hubble	7acd4931f6	Remove redundant kubeconfig copy on AWS and GCP * AWS and Google Cloud make use of auto-scaling groups and managed instance groups, respectively. As such, the kubeconfig is already held in cloud user-data * Controller instances are provisioned with a kubeconfig from user-data. It's redundant to use a Terraform remote file copy step for the kubeconfig.	2018-03-26 00:01:47 -07:00
Dalton Hubble	e43cf9f608	Organize and cleanup variable descriptions	2018-03-25 21:44:43 -07:00
Dalton Hubble	a04ef3919a	Update Kubernetes from v1.9.5 to v1.9.6	2018-03-21 20:29:52 -07:00
Dalton Hubble	758c09fa5c	Update Kubernetes from v1.9.4 to v1.9.5	2018-03-19 00:25:44 -07:00
Dalton Hubble	f3730b2bfa	Add Container Linux Config snippets feature * Introduce the ability to support Container Linux Config "snippets" for controllers and workers on cloud platforms. This allows end-users to customize hosts by providing Container Linux configs that are additively merged into the base configs defined by Typhoon. Config snippets are validated, merged, and show any errors during `terraform plan` * Example uses include adding systemd units, network configs, mounts, files, raid arrays, or other disk provisioning features provided by Container Linux Configs (using Ignition low-level) * Requires terraform-provider-ct v0.2.1 plugin	2018-03-18 18:28:18 -07:00
Dalton Hubble	88aa9a46e5	Add /var/lib/calico volume mount to Calico DaemonSet	2018-03-18 16:40:38 -07:00
Dalton Hubble	efa90d8b44	Add a new key=value label to controller nodes * Add a node-role.kubernetes.io/controller="true" node label to controllers so Prometheus service discovery can filter to services that only run on controllers (i.e. masters) * Leave node-role.kubernetes.io/master="" untouched as it's a Kubernetes convention	2018-03-18 16:39:10 -07:00
Dalton Hubble	21f2cef12f	Improve changelog, README, and index page	2018-03-12 20:58:02 -07:00
Dalton Hubble	931e311786	Update Kubernetes from v1.9.3 to v1.9.4	2018-03-12 18:07:50 -07:00
Dalton Hubble	8e7e6b9f7f	Normalize Terraform configs with terraform fmt	2018-03-11 14:46:05 -07:00

115 Commits