Commit Graph

315 Commits

Author SHA1 Message Date
Vincent Palmer ce49a93d5d Fix issue with etcd-member failing to resolve peers
* When restarting masters, `etcd-member.service` may fail to look up peers if
/etc/resolv.conf hasn't been populated yet. Require the wait-for-dns.service.
2017-12-09 20:12:49 -08:00
Khris Richardson e623439eec Fix typos in docs and CONTRIBUTING.md 2017-12-09 19:58:09 -08:00
Dalton Hubble 9548572d98 Add kubelet --volume-plugin-dir flag on bare-metal
* Kubelet will search path for flexvolume plugins
2017-12-05 13:12:53 -08:00
Dalton Hubble f00ecde854 Rollback nginx-ingress on GCE to 0.9.0-beta.17
* https://github.com/kubernetes/ingress-nginx/issues/1788
2017-12-02 14:06:22 -08:00
Dalton Hubble d85300f947 Clarify only Terraform v0.10.x should be used
* It is not safe to update to Terraform v0.11.x yet
* https://github.com/hashicorp/terraform/issues/16824
2017-12-02 01:31:39 -08:00
Dalton Hubble 65f006e6cc addons: Sync prometheus alerts to upstream
* https://github.com/coreos/prometheus-operator/pull/774
2017-12-01 23:24:08 -08:00
Dalton Hubble 8d3817e0ae addons: Update nginx-ingress to 0.9.0-beta.19
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.9.0-beta.19
2017-12-01 22:32:33 -08:00
Dalton Hubble 5f5eec1175 Update bootkube and terraform-render-bootkube to v0.9.0 2017-12-01 22:27:48 -08:00
Dalton Hubble 5308fde3d3 Add Kubernetes certification badge 2017-11-29 19:26:49 -08:00
Dalton Hubble 9ab61d7bf5 Add Typhoon images with and without text
* Serve images from GCS poseidon, rather than dghubble
2017-11-29 01:01:01 -08:00
Dalton Hubble 6483f613c5 Update Kubernetes from v1.8.3 to v1.8.4 2017-11-28 21:52:11 -08:00
Dalton Hubble 56c6bf431a Update terraform-render-bootkube for Kubernetes v1.8.4
* Update hyperkube from v1.8.3 to v1.8.4
* Remove flock from bootstrap-apiserver and kube-apiserver
* Remove unused critical-pod annotations in manifests
* Use service accounts for kube-proxy and pod-checkpointer
* Update Calico from v2.6.1 to v2.6.3
* Update flannel from v0.9.0 to v0.9.1
* Remove Calico termination grace period to prevent calico
from getting stuck for extended periods
* https://github.com/poseidon/terraform-render-bootkube/pull/29
2017-11-28 21:42:26 -08:00
Dalton Hubble 63ab117205 addons: Add prometheus rules for DaemonSets
* https://github.com/coreos/prometheus-operator/pull/755
2017-11-16 23:51:21 -08:00
Dalton Hubble 1cd262e712 addons: Fix prometheus K8SApiServerLatency alert rule
* https://github.com/coreos/prometheus-operator/issues/751
2017-11-16 23:37:15 -08:00
Dalton Hubble 32bdda1b6c addons: Update Grafana from v4.6.1 to v4.6.2
* https://github.com/grafana/grafana/releases/tag/v4.6.2
2017-11-16 23:34:36 -08:00
Dalton Hubble 07d257aa7b Add initrd kernel argument needed by UEFI clients
* https://github.com/coreos/bugs/issues/1239
2017-11-16 23:19:51 -08:00
Dalton Hubble fd96067125 Fix docs link for security issue reporting 2017-11-10 21:38:41 -08:00
Dalton Hubble 9d16f5c78a Update min Google plugin and remove target pool workaround
* With google provider 1.2, target pool instances can use self_link
and zone/name formats without causing a diff on each plan
* Original workaround: 77fc14db71
2017-11-10 21:15:19 -08:00
Dalton Hubble 159443bae7 addons: Add better alerting rules to Prometheus manifests
* Adapt the coreos/prometheus-operator alerting rules for Typhoon,
https://github.com/coreos/prometheus-operator/tree/master/contrib/kube-prometheus/manifests
* Add controller manager and scheduler shim services to let
prometheus discover them via service endpoints
* Fix several alert rules to use service endpoint discovery
* A few rules still don't do much, but they default to green
2017-11-10 20:57:47 -08:00
Dalton Hubble 119dc859d3 addons: Update nginx-ingress to 0.9.0-beta.17
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.9.0-beta.17
2017-11-10 20:16:40 -08:00
Dalton Hubble 5f6b0728c5 Update bootkube and terraform-render-bootkube to v0.8.2 2017-11-10 20:01:37 -08:00
Dalton Hubble d774c51297 Update Kubernetes from v1.8.2 to v1.8.3 2017-11-08 23:34:19 -08:00
Dalton Hubble f6a8fb363e Remove deprecated kubelet --require-kubeconfig flag
* https://github.com/kubernetes/kubernetes/pull/40050
2017-11-08 23:34:19 -08:00
Dalton Hubble f570af9418 addons: Update from Prometheus v1.8.2 to v2.0.0 2017-11-08 22:48:23 -08:00
Dalton Hubble 4ec6732b98 Output the Google network name and self_link
* Allow users to add custom firewall rules for unique cases
2017-11-08 00:19:49 -08:00
Dalton Hubble ea1efb536a Remove old firewall rule for bootstrap self-hosted etcd 2017-11-08 00:15:20 -08:00
Dalton Hubble 451fd86470 Improve internal firewall rules on Google Cloud
* Whitelist internal traffic between controllers and workers
* Switch to tag-based firewall policies rather than source IP
2017-11-08 00:15:06 -08:00
Dalton Hubble b1b611b22c Add docs to use one controller on Google Cloud 2017-11-07 19:51:03 -08:00
Dalton Hubble eabf00fbf1 Add missing controller dependency before bootkube start
* Require the controller module to be completed before starting
to remote exec bootkube start, otherwise it's possible the controller
nodes were created, but not the network load balancer
2017-11-07 19:12:05 -08:00
Dalton Hubble 8eaa72c1ca addons: Update nginx-ingress to 0.9.0-beta.16
* Image registry changed from gcr.io to quay.io
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.9.0-beta.16
2017-11-06 23:15:15 -08:00
Dalton Hubble 58cf82da56 Promote AWS platform from alpha to beta 2017-11-06 21:38:24 -08:00
Dalton Hubble ccc832f468 Add firewall rule to allow apiserver to proxy other controller kubelets
* Prometheus proxies through the apiserver to scrape kubelets
* In multi-controller setups, an apiserver must be able to scrape
kubelets (10250) on other controllers
2017-11-06 01:03:53 -08:00
Dalton Hubble 90f8d62204 Add firewall rules to allow prometheus to reach node-exporter
* node_exporter service endpoints run on hostNetwork port 9100
* Re-evaluate after https://github.com/kubernetes-incubator/bootkube/pull/711
2017-11-06 01:03:53 -08:00
Dalton Hubble af5c413abf Focus controller ELB on load balancing apiservers
* ELB distributing load across controllers is no longer the mechanism
used to SSH to instances to distribute secrets
* Focus the ELB on load balancing across apiserver and edit the HTTP
health check to an SSL:443 check
2017-11-06 01:03:53 -08:00
Dalton Hubble 168c487484 Remove mention of self-hosted etcd, it's deprecated 2017-11-06 01:03:53 -08:00
Dalton Hubble 805dd772a8 Run etcd cluster on-host, across controllers on AWS
* Change controllers ASG to heterogeneous EC2 instances
* Create DNS records for each controller's private IP for etcd
* Change etcd to run on-host, across controllers (etcd-member.service)
* Reduce time to bootstrap a cluster
* Deprecate self-hosted-etcd on the AWS platform
2017-11-06 01:03:53 -08:00
Dalton Hubble c6ec6596d8 Minor cleanup for zones, docs, and outputs
* Spread across all zones, regardless of UP/DOWN state
* Remove unused outputs of private IPs
2017-11-06 00:56:26 -08:00
Dalton Hubble 47a9989927 Fix null_resource ordering constraints
* Ensure etcd TLS assets and kubeconfig are copied before
any attempt is made to run bootkube start
2017-11-06 00:55:44 -08:00
Dalton Hubble 10b977d54a addons: Set kube-state-metrics to have clusterIP None
* kube-state-metrics service exists to facilitate prometheus discovery
2017-11-05 17:54:09 -08:00
Dalton Hubble b7a268fc45 addons: Add prometheus alertmanager flag
* Pass -alertmanager.url to work with a user's in-cluster
alertmanager deployment, if any
2017-11-05 15:50:46 -08:00
Dalton Hubble 279f36effd addons: Add grafana 4.6.1 and extend prometheus docs 2017-11-05 15:23:56 -08:00
Dalton Hubble 77fc14db71 Workaround target pool issue by listing instances as zone/name
* Instances can be listed by zone/name or self_link URL, but the
provider desires they be in zone/name form, which causes a diff
* https://github.com/terraform-providers/terraform-provider-google/issues/46
2017-11-05 14:07:05 -08:00
Dalton Hubble 2b0296d671 Create controller instances across zones in the region
* Change controller instances to automatically span zones in a region
* Remove the `zone` required variable
2017-11-05 13:24:32 -08:00
Dalton Hubble 7b38271212 Run etcd cluster on-host, across controllers on Google Cloud
* Change controllers from a managed group to individual instances
* Create discrete DNS records for each controller's private IP for etcd
* Change etcd to run on-host, across controllers (etcd-member.service)
* Reduce time to bootstrap a cluster
* Deprecate self-hosted-etcd on the Google Cloud platform
2017-11-05 11:03:35 -08:00
Dalton Hubble ae07a21e3d addons: Omit static resource requests/limits for kube-state-metrics
* Allow the addon-resizer to dynamically set resource values
* https://github.com/kubernetes/kube-state-metrics/pull/285
2017-11-04 14:41:04 -07:00
Dalton Hubble 0ab1ae3210 addons: Fix typo in kube-state-metrics strategy 2017-11-04 14:39:56 -07:00
Dalton Hubble 67e3d2b86e docs: GCE network bandwidth is excellent, even between zones
* Remove performance note that the GCE vs AWS network performance
is not an equal comparison. On both platforms, workers now span the
(availability) zones of a region.
* Testing host-to-host and pod-to-pod network bandwidth between nodes
(now located in different zones) showed no reduction in bandwidth
2017-11-04 14:08:20 -07:00
Dalton Hubble a48dd9ebd8 Require google provider version ~> 1.1
* Require google provider plugin 1.1 or higher which includes fix:
https://github.com/terraform-providers/terraform-provider-google/issues/574
* Remove workaround which statically set the persistent disk name
* Original reasons for workaround in a97df839 or GH #34
2017-11-04 12:59:19 -07:00
Dalton Hubble 26a291aef4 Remove controller_preemptible option on Google Cloud
* Controller preemption is not safe or covered in documentation. Delete
the option; the variable is a holdover from old experiments
* Note, worker_preemptible is still a great feature that's supported
2017-11-04 12:59:19 -07:00
Dalton Hubble 251a14519f Fix typo in internal template variable name
* ssh_authorized_keys should be ssh_authorized_key to match the
user-facing variable which only allows a single SSH authorized key
2017-11-04 12:59:19 -07:00