Commit Graph

780 Commits

Author SHA1 Message Date
Dalton Hubble
bc750aec33 Configure Heapster to source metrics from Kubelet authenticated API
* Heapster can now get nodes (i.e. kubelets) from the apiserver and
source metrics from the Kubelet authenticated API (10250) instead of
the Kubelet HTTP read-only API (10255)
* https://github.com/kubernetes/heapster/blob/master/docs/source-configuration.md
* Use the heapster service account token via Kubelet bearer token
authn/authz.
* Permit Heapster to skip CA verification. The CA cert does not contain
IP SANs and cannot since nodes get random IPs that aren't known upfront.
Heapster obtains the node list from the apiserver, so the risk of
spoofing a node is limited. For the same reason, Prometheus scrapes
must skip CA verification for scraping Kubelet's provided by the apiserver.
* https://github.com/poseidon/typhoon/blob/v1.12.1/addons/prometheus/config.yaml#L68
* Create a heapster ClusterRole to work around the default Kubernetes
`system:heapster` ClusterRole lacking the proper GET `nodes/stats`
access. See https://github.com/kubernetes/heapster/issues/1936
2018-10-18 21:03:01 -07:00
Dalton Hubble
d55bfd5589 Fix CoreDNS AntiAffinity spec to prefer spreading replicas
* Pods were still being scheduled at random due to a typo
2018-10-17 22:19:57 -07:00
Robert Fairburn
0be4673e44 Add disk_iops variable for AWS
* Setting disk_iops is required for disk_type io1
* https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html#EBSVolumeTypes
2018-10-17 22:18:54 -07:00
Dalton Hubble
3b44972d78 Add links to header to CHANGES 2018-10-17 09:08:58 -07:00
Dalton Hubble
0127ee82c1 Update nginx-ingress from v0.19.0 to v0.20.0 2018-10-16 21:35:29 -07:00
Dalton Hubble
a10d6977b8 Update Prometheus from v2.4.2 to v2.4.3
* https://github.com/prometheus/prometheus/releases/tag/v2.4.3
2018-10-16 21:29:41 -07:00
Dalton Hubble
05fe923c14 Update Grafana from v5.3.0 to v5.3.1
* https://github.com/grafana/grafana/releases/tag/v5.3.1
2018-10-16 21:23:44 -07:00
Michael Schubert
d10620fb58 Add support for Flatcar Linux bare-metal cached_install
* Support bare-metal cached_install=true mode with Flatcar Linux
where assets are fetched from the Matchbox assets cache instead
of from the upstream Flatcar download server
* Skipped in original Flatcar support to keep it simple
https://github.com/poseidon/typhoon/pull/209
2018-10-16 21:15:24 -07:00
Dalton Hubble
9b6113a058 Update Kubernetes from v1.11.3 to v1.12.1
* Mount an empty dir for the controller-manager to work around
https://github.com/kubernetes/kubernetes/issues/68973
* Update coreos/pod-checkpointer to strip affinity from
checkpointed pod manifests. Kubernetes v1.12.0-rc.1 introduced
a default affinity that appears on checkpointed manifests; but
it prevented scheduling and checkpointed pods should not have an
affinity, they're run directly by the Kubelet on the local node
* https://github.com/kubernetes-incubator/bootkube/issues/1001
* https://github.com/kubernetes/kubernetes/pull/68173
2018-10-16 20:28:13 -07:00
Dalton Hubble
5eb4078d68 Add docker/default seccomp to control plane and addons
* Annotate pods, deployments, and daemonsets to start containers
with the Docker runtime's default seccomp profile
* Overrides Kubernetes default behavior which started containers
with seccomp=unconfined
* https://docs.docker.com/engine/security/seccomp/#pass-a-profile-for-a-container
2018-10-16 20:07:29 -07:00
Dalton Hubble
8f0d2b5db4 Update Grafana from v5.2.4 to v5.3.0 2018-10-13 23:03:31 -07:00
Dalton Hubble
2e89e161e9 Remove Azure admin_password (disabled) now that its optional
* Requires terraform-provider-azurerm v1.16.0 or higher
https://github.com/terraform-providers/terraform-provider-azurerm/pull/1958
2018-10-13 22:40:58 -07:00
Dalton Hubble
55bb4dfba6 Raise CoreDNS replica count to 2 or more
* Run at least two replicas of CoreDNS to better support
rolling updates (previously, kube-dns had a pod nanny)
* On multi-master clusters, set the CoreDNS replica count
to match the number of masters (e.g. a 3-master cluster
previously used replicas:1, now replicas:3)
* Add AntiAffinity preferred rule to favor distributing
CoreDNS pods across controller nodes nodes
2018-10-13 20:31:29 -07:00
Dalton Hubble
43fe78a2cc Raise scheduler/controller-manager replicas in multi-master
* Continue to ensure scheduler and controller-manager run
at least two replicas to support performing kubectl edits
on single-master clusters (no change)
* For multi-master clusters, set scheduler / controller-manager
replica count to the number of masters (e.g. a 3-master cluster
previously used replicas:2, now replicas:3)
2018-10-13 16:16:29 -07:00
Dalton Hubble
5a283b6443 Update etcd from v3.3.9 to v3.3.10
* https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.3.md#v3310-2018-10-10
2018-10-13 13:14:37 -07:00
Stephen Cuppett
ff0c271d7b Fix Calico network policy docs link 2018-10-07 21:02:16 +02:00
Josh Reichardt
89088b6a99 Fix broken README links 2018-10-07 20:56:13 +02:00
Dalton Hubble
ee569e3a59 Change rooted links to absolute links for mkdocs v1.0
* mkdocs v1.0 now evaluates links beginning with / as
absolute links that are only prepended with the site url
2018-10-07 20:51:54 +02:00
Dalton Hubble
daeaec0832 Fix requirements for docs builds 2018-10-07 19:26:55 +02:00
Dalton Hubble
db36036c81 Require terraform-provider-digitalocean plugin ~> 1.0
* Require a terraform-provider-digitalocean plugin version of
1.0 or higher within the same major version (e.g. allow 1.1 but
not 2.0)
* Change requirement from ~> 0.1.2 (which allowed up to but not
including 1.0 release)
2018-10-02 17:09:19 +02:00
Dalton Hubble
7653e511be Update CoreDNS and Calico versions
* Update CoreDNS from 1.1.3 to 1.2.2
* Update Calico from v3.2.1 to v3.2.3
2018-10-02 16:07:48 +02:00
Dennis Schridde
f8474d68c9 Fix typo in maintenance docs 2018-09-25 03:07:00 -07:00
Dalton Hubble
032a24133b Update Prometheus from v2.3.2 to v2.4.2
* https://github.com/prometheus/prometheus/releases/tag/v2.4.0
* https://github.com/prometheus/prometheus/releases/tag/v2.4.1
* https://github.com/prometheus/prometheus/releases/tag/v2.4.2
2018-09-21 22:27:11 -07:00
Dalton Hubble
f5f442b658 Improve the issue template prompt
* Clarify that reporting versions as latest is not
helpful
2018-09-15 10:05:07 -07:00
Dalton Hubble
ad871dbfa9 Update Kubernetes from v1.11.2 to v1.11.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1113
2018-09-13 18:50:41 -07:00
Sendhil Panchadsaram
5801be53ff Fix typo in ingress addon docs 2018-09-08 18:28:42 -07:00
Sendhil Panchadsaram
c1b1669cf8 Fix Google Cloud preemption link in docs 2018-09-08 18:28:25 -07:00
Dalton Hubble
dc03f7a4a9 Update nginx-ingress from 0.17.1 to 0.19.0
* If using --enable-ssl-passthrough or exposing TCP/UDP services,
be aware of https://github.com/kubernetes/ingress-nginx/pull/3038
* Workarounds until the fix merges are to stay on 0.17.1, use the
suggested development image, or revert to securityContext
`runAsNonRoot: false` for a while (less secure)
2018-09-08 17:57:01 -07:00
Dalton Hubble
1b8234eb91 Update Grafana from v5.2.2 to v5.2.4
* https://github.com/grafana/grafana/releases/tag/v5.2.3
* https://github.com/grafana/grafana/releases/tag/v5.2.4
2018-09-08 15:41:20 -07:00
Dalton Hubble
4ba090feb0 Update kube-state-metrics from v1.3.1 to v1.4.0 2018-08-29 09:37:50 -07:00
Dalton Hubble
4882fe1053 Add docs for Azure Ingress and worker pools
* Azure worker pools must be in the same region as
the cluster itself unfortunately
2018-08-27 23:30:56 -07:00
Dalton Hubble
019009e9ee Add outputs for Azure ingress IPv4 and worker pools 2018-08-27 23:30:32 -07:00
Dalton Hubble
991a5c6cee Add new tutorial docs and links 2018-08-27 23:30:32 -07:00
Dalton Hubble
c60ec642bc Fix Azure delete-node script to lowercase hostnames
* Fix issue where worker nodes didn't delete themselves on
scale-down or deallocation (e.g. low priority instances).
Lowercase the hostname and delete the Kubernetes node
* Kubelet registers the lowercase hostname as the node name,
but Azure workers get hostname CLUSTER-worker-GENERATED where
the generated identifier may contain uppercase characters
2018-08-27 23:30:32 -07:00
Dalton Hubble
38b4ff4700 Add module for Typhoon Azure with Container Linux 2018-08-27 23:30:32 -07:00
Dalton Hubble
7eb09237f4 Update Calico from v3.1.3 to v3.2.1
* Add new bird and felix readiness checks
* Read MTU from ConfigMap veth_mtu
* Add RBAC read for serviceaccounts
* Remove invalid description from CRDs
2018-08-25 17:53:11 -07:00
Dalton Hubble
e58b424882 Fix firewall to allow etcd client traffic between controllers
* Broaden internal-etcd firewall rule to allow etcd client
traffic (2379) from other controller nodes
* Previously, kube-apiservers were only able to connect to their
node's local etcd peer. While master node outages were tolerated,
reaching a healthy peer took longer than neccessary in some cases
* Reduce time needed to bootstrap a cluster
2018-08-21 23:51:40 -07:00
Dalton Hubble
b8eeafe4f9 Template etcd_servers list to replace null_resource.repeat
* Remove the last usage of null_resource.repeat, which has
always been an eyesore for creating the etcd server list
* Originally, #224 switched to templating the etcd_servers
list for all clouds, but had to revert on GCP in #237
* https://github.com/poseidon/typhoon/pull/224
* https://github.com/poseidon/typhoon/pull/237
2018-08-21 22:46:24 -07:00
Dalton Hubble
bdf1e6986e Fix terraform fmt 2018-08-21 21:59:55 -07:00
Dalton Hubble
99e3721181 Mention Fedora Atomic system container artifacts
* Typhoon for Fedora Atomic uses system containers, container
images containing metadata, but built directly from upstream
and published and serve through Quay.io
* https://github.com/poseidon/system-containers
2018-08-21 21:52:52 -07:00
Dalton Hubble
ea365b551a Fix docs mentions of ELBs to NLBs
* Typhoon AWS clusters use an NLB rather than an ELB,
since v1.10.5
* Add a few missing links in CHANGES
2018-08-21 21:40:06 -07:00
Dalton Hubble
bbf2c13eef Remove AWS security rule allowing ICMP packets to nodes
* Deny ICMP packets for consistency across Typhoon clusters on
various clouds and because there isn't much need to allow them
2018-08-21 21:16:16 -07:00
Dalton Hubble
da5d2c5321 Remove GCP firewall rule allowing Nginx Ingress health
* Nginx Ingress addon no longer uses hostNework so Prometheus may
scrape port 10254 via the CNI network, rather than via the host
address
2018-08-21 21:06:03 -07:00
Dalton Hubble
bceec9fdf5 Sort firewall / security rules and add comments
* No functional changes to network firewalls
2018-08-21 20:53:16 -07:00
Becca Powell
49a9dc9b8b Fix typo in Prometheus alerting rules 2018-08-21 16:55:49 -07:00
Dalton Hubble
bec5250e73 Remove unofficial bare-metal *_networkds variables
* Remove controller_networkds and worker_networkds variables. These
variables were always listed as experimental, unsupported, and excluded
from documentation in anticipation of Container Linux Config snippets
* Use Container Linux Config snippets on bare-metal instead. They
provide safer, more powerful, and more elegant host customization
2018-08-13 23:33:29 -07:00
Dalton Hubble
dbdc3fc850 Add nginx-ingress addon manifests for bare-metal 2018-08-11 12:14:23 -07:00
Dalton Hubble
e00f97c578 Update nginx-ingress from 0.16.2 to 0.17.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.17.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.17.0
2018-08-08 00:45:20 -07:00
Dalton Hubble
f7ebdf475d Update Kubernetes from v1.11.1 to v1.11.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1112
2018-08-07 21:57:25 -07:00
Dalton Hubble
716dfe4d17 Fix cluster links in customization docs 2018-07-29 12:46:59 -07:00