typhoon

mirror of https://github.com/puppetmaster/typhoon.git synced 2025-10-04 17:04:38 +02:00

Author	SHA1	Message	Date
Michael Holt	a5916da0e2	Update min AWS provider from v1.11 to v1.13	2018-05-02 15:16:03 -07:00
Dalton Hubble	cc29530ba0	Allow preemptible workers on AWS via spot instances * Add `worker_price` to allow worker spot instances. Defaults to empty string for the worker autoscaling group to use regular on-demand instances. * Add `spot_price` to internal `workers` module for spot worker pools * Note: Unlike GCP `preemptible` workers, spot instances require you to pick a bid price.	2018-04-29 13:31:17 -07:00
Dalton Hubble	e889430926	Update kube-dns from v1.14.9 to v1.14.10 * https://github.com/kubernetes/kubernetes/pull/62676	2018-04-28 00:43:09 -07:00
Dalton Hubble	32ddfa94e1	Update Kubernetes from v1.10.1 to v1.10.2 * https://github.com/kubernetes/kubernetes/releases/tag/v1.10.2	2018-04-28 00:27:00 -07:00
Dalton Hubble	681450aa0d	Update etcd from v3.3.3 to v3.3.4 * https://github.com/coreos/etcd/releases/tag/v3.3.4	2018-04-27 23:57:26 -07:00
Dalton Hubble	567e18f015	Fix conflict between Calico and NetworkManager * Observed frequent kube-scheduler and controller-manager restarts with Calico as the CNI provider. Root cause was unclear since control plane was functional and tests of pod to pod network connectivity passed * Root cause: Calico sets up cali* and tunl* network interfaces for containers on hosts. NetworkManager tries to manage these interfaces. It periodically disconnected veth pairs. Logs did not surface this issue since it's not an error per se, just Calico and NetworkManager dueling for control. Kubernetes correctly restarted pods failing health checks and ensured 2 replicas were running so the control plane functioned mostly normally. Pod to pod connectivity was only affected occasionally. Pain to debug. * Solution: Configure NetworkManager to ignore the Calico ifaces per Calico's recommendation. Cloud-init writes files after NetworkManager starts, so a restart is required on first boot. On subsequent boots, the file is present so no restart is needed	2018-04-25 21:45:58 -07:00
Dalton Hubble	0a7fab56e2	Load ip_vs kernel module on boot as workaround * (containerized) kube-proxy warns that it is unable to load the ip_vs kernel module despite having the correct mounts. Atomic uses an xz compressed module and modprobe in the container was not compiled with compression support * Workaround issue for now by always loading ip_vs on-host * https://github.com/kubernetes/kubernetes/issues/60	2018-04-25 21:45:58 -07:00
Dalton Hubble	d784b0fca6	Switch to quay.io/poseidon tagged system containers	2018-04-25 18:15:18 -07:00
Dalton Hubble	7198b9016c	Update Calico from v3.0.4 to v3.1.1 for Atomic	2018-04-21 18:46:56 -07:00
Dalton Hubble	233ec6dcb0	Update Fedora Atomic AMI to version 27.122 * http://www.projectatomic.io/blog/2018/04/fedora-atomic-20-apr-18/ * Atomic publishes nightly AMIs which sometimes don't boot or have issues. Until there is a source of reliable AMIs, pin the best known working AMI * Rel 66a66f0d18544591ffdbf8fae9df790113c93d72	2018-04-21 18:46:56 -07:00
Dalton Hubble	9b88d4bbfd	Use bootkube system container on fedora-atomic * Use the upstream bootkube image packaged with the required metadata to be usable as a system container under systemd * Run bootkube with runc so no host level components use Docker any more. Docker is still the runtime * Remove bootkube script and old systemd unit	2018-04-21 18:46:56 -07:00
Dalton Hubble	3dde4ba8ba	Mount host's /etc/os-release in kubelet system containers * Fix `kubectl describe node` to reflect the host's operating system	2018-04-21 18:46:56 -07:00
Dalton Hubble	e148552220	Enable kubelet allocatable enforcement and QoS cgroup hierarchy * Change kubelet system image to use --cgroups-per-qos=true (default) instead of false * Change kubelet system image to use --enforce-node-allocatable=pods instead of an empty string	2018-04-21 18:46:56 -07:00
Dalton Hubble	d8d1468f03	Update kubelet system container image to mount /etc/hosts * Fix kubelet port-forward on Google Cloud / Fedora Atomic * Mount the host's /etc/hosts in kubelet system containers * Problem: kubelet runc system containers on Atomic were not mounting the host's /etc/hosts, like rkt-fly does on Container Linux. `kubectl port-forward` calls socat with localhost. DNS servers on AWS, DO, and in many bare-metal environments resolve localhost to the caller as a convenience. Google Cloud notably does not, nor is it required to do so, and this surfaced the missing /etc/hosts in runc kubelet namespaces.	2018-04-21 18:46:56 -07:00
Dalton Hubble	24d230505a	Add cloud-metadata.service on AWS fedora-atomic	2018-04-21 18:46:56 -07:00
Dalton Hubble	b3cf9508b6	Update Fedora Atomic modules to Kubernetes v1.10.1	2018-04-21 18:46:56 -07:00
Dalton Hubble	5212684472	Temporarily pin Fedora Atomic AMI * Atomic has published AMI images that shutdown immediately after being powered on	2018-04-21 18:46:56 -07:00
Dalton Hubble	f990473cde	Update control plane manifests and add etcd metrics * Enable etcd v3.3 metrics to expose metrics for scraping by Prometheus * Use k8s.gcr.io instead of gcr.io/google_containers * Add flexvolume plugin mount to controller manager * Update kube-dns from v1.14.8 to v1.14.9	2018-04-21 18:46:56 -07:00
Dalton Hubble	8523a086e2	Fix kubelet system container to mount CNI plugins * Mount /opt/cni/bin in kubelet system container so CNI plugin binaries can be found. Before, flannel worked because the kubelet falls back to flannel plugin baked into the hyperkube (undesired) * Move the CNI bin install location later, since /opt changes may be lost between ostree rebases	2018-04-21 18:46:56 -07:00
Dalton Hubble	19bc5aea9e	Use kubelet system container on fedora-atomic * Use the upstream hyperkube image packaged with the required metadata to be usable as a system container under systemd * Fix port-forward since socat is included	2018-04-21 18:46:56 -07:00
Dalton Hubble	8d7cfc1a45	Use etcd system container on fedora-atomic * Use the upstream etcd image packaged with the required metadata to be usable as a system container (runc) under systemd	2018-04-21 18:46:56 -07:00
Dalton Hubble	9969c357da	Change AWS Fedora module to fedora-atomic	2018-04-21 18:46:56 -07:00
Dalton Hubble	b80a2eb8a0	Sync fedora-cloud modules with Container Linux * Update manifests for Kubernetes v1.10.0 * Update etcd from v3.3.2 to v3.3.3 * Add disk_type optional variable on AWS * Remove redundant kubeconfig copy on AWS * Distribute etcd secrets only to controllers * Organize module variables and ssh steps	2018-04-21 18:46:56 -07:00
Dalton Hubble	3610da8b71	Add fedora-cloud module for AWS	2018-04-21 18:46:56 -07:00
Dalton Hubble	a54f76db2a	Update Calico from v3.0.4 to v3.1.1 * https://github.com/projectcalico/calico/releases/tag/v3.1.1 * https://github.com/projectcalico/calico/releases/tag/v3.1.0	2018-04-21 18:30:36 -07:00
Dalton Hubble	23a8156bdf	Fix a few typos in comments	2018-04-15 17:21:49 -07:00
Dalton Hubble	77c0a4cf2e	Update Kubernetes from v1.10.0 to v1.10.1 * Use kubernetes-incubator/bootkube v0.12.0	2018-04-12 20:57:31 -07:00
Dalton Hubble	9bb3de5327	Skip creating unused dirs on worker nodes	2018-04-11 22:23:51 -07:00
Dalton Hubble	6b08bde479	Use k8s.gcr.io instead of gcr.io/google_containers * Kubernetes recommends using the alias to fetch images from the nearest GCR regional mirror, to abstract the use of GCR, and to drop names containing 'google' * https://groups.google.com/forum/#!msg/kubernetes-dev/ytjk_rNrTa0/3EFUHvovCAAJ	2018-04-08 12:57:52 -07:00
Dalton Hubble	f4b2396718	Return Prometheus deployment to be a worker workload * Expose etcd metrics to workers so Prometheus can run on a worker, rather than a controller * Drop temporary firewall rules allowing Prometheus to run on a controller and scrape targets * Related to https://github.com/poseidon/typhoon/pull/175	2018-04-08 12:20:00 -07:00
Dalton Hubble	18dbaf74ce	Update kube-dns from v1.14.8 to v1.14.9 * https://github.com/kubernetes/kubernetes/pull/61908	2018-04-04 21:00:23 -07:00
Dalton Hubble	ce001e9d56	Update etcd from v3.3.2 to v3.3.3 * https://github.com/coreos/etcd/releases/tag/v3.3.3	2018-04-04 20:32:24 -07:00
Dalton Hubble	d770393dbc	Add etcd metrics, Prometheus scrapes, and Grafana dash * Use etcd v3.3 --listen-metrics-urls to expose only metrics data via http://0.0.0.0:2381 on controllers * Add Prometheus discovery for etcd peers on controller nodes * Temporarily drop two noisy Prometheus alerts	2018-04-03 20:31:00 -07:00
Dalton Hubble	1cc043d1eb	Update Kubernetes from v1.9.6 to v1.10.0	2018-03-30 22:14:07 -07:00
Dalton Hubble	f8e9bfb1c0	Add disk_type variable for EBS volume type on AWS * Change EBS volume type from `standard` ("prior generation") to `gp2`. Prometheus alerts are tuned for SSDs * Other platforms have fast enough disks by default	2018-03-29 22:51:54 -07:00
Dalton Hubble	7acd4931f6	Remove redundant kubeconfig copy on AWS and GCP * AWS and Google Cloud make use of auto-scaling groups and managed instance groups, respectively. As such, the kubeconfig is already held in cloud user-data * Controller instances are provisioned with a kubeconfig from user-data. It's redundant to use a Terraform remote file copy step for the kubeconfig.	2018-03-26 00:01:47 -07:00
Dalton Hubble	e43cf9f608	Organize and cleanup variable descriptions	2018-03-25 21:44:43 -07:00
Dalton Hubble	a04ef3919a	Update Kubernetes from v1.9.5 to v1.9.6	2018-03-21 20:29:52 -07:00
Dalton Hubble	758c09fa5c	Update Kubernetes from v1.9.4 to v1.9.5	2018-03-19 00:25:44 -07:00
Dalton Hubble	f3730b2bfa	Add Container Linux Config snippets feature * Introduce the ability to support Container Linux Config "snippets" for controllers and workers on cloud platforms. This allows end-users to customize hosts by providing Container Linux configs that are additively merged into the base configs defined by Typhoon. Config snippets are validated, merged, and show any errors during `terraform plan` * Example uses include adding systemd units, network configs, mounts, files, raid arrays, or other disk provisioning features provided by Container Linux Configs (using Ignition low-level) * Requires terraform-provider-ct v0.2.1 plugin	2018-03-18 18:28:18 -07:00
Dalton Hubble	88aa9a46e5	Add /var/lib/calico volume mount to Calico DaemonSet	2018-03-18 16:40:38 -07:00
Dalton Hubble	efa90d8b44	Add a new key=value label to controller nodes * Add a node-role.kubernetes.io/controller="true" node label to controllers so Prometheus service discovery can filter to services that only run on controllers (i.e. masters) * Leave node-role.kubernetes.io/master="" untouched as it's a Kubernetes convention	2018-03-18 16:39:10 -07:00
Dalton Hubble	21f2cef12f	Improve changelog, README, and index page	2018-03-12 20:58:02 -07:00
Dalton Hubble	931e311786	Update Kubernetes from v1.9.3 to v1.9.4	2018-03-12 18:07:50 -07:00
Dalton Hubble	8e7e6b9f7f	Normalize Terraform configs with terraform fmt	2018-03-11 14:46:05 -07:00
Dalton Hubble	35f3b1b28c	Enable AWS NLB cross-zone load balancing * https://github.com/terraform-providers/terraform-provider-aws/pull/3537 * https://aws.amazon.com/about-aws/whats-new/2018/02/network-load-balancer-now-supports-cross-zone-load-balancing/	2018-03-10 23:25:18 -08:00
Dalton Hubble	9fb1e1a0e2	Update etcd from v3.3.1 to v3.3.2 * https://github.com/coreos/etcd/releases/tag/v3.3.2	2018-03-10 13:44:35 -08:00
Dalton Hubble	b61d6373c5	Add ignore_changes for AWS worker image_id	2018-03-10 13:16:05 -08:00
Dalton Hubble	c112ee3829	Rename cluster_name to name in internal module * Ensure consistency between AWS and GCP platforms	2018-03-03 17:52:01 -08:00
Dalton Hubble	da6aafe816	Revert "Add module version requirements to internal workers modules" * This reverts commit `cce4537487`. * Provider passing to child modules is complex and the behavior changed between Terraform v0.10 and v0.11. We're continuing to allow both versions so this change should be reverted. For the time being, those using our internal Terraform modules will have to be aware of the minimum version for AWS and GCP providers; there is no good way to do enforcement.	2018-03-03 16:56:34 -08:00


110 Commits