Commit Graph

239 Commits

Author SHA1 Message Date
Dalton Hubble e483c81ce9 Improve Prometheus rules and alerts and Grafana dashboards
* Collate upstream rules, alerts, and dashboards and tune for use
in Typhoon
* Previously, a well-chosen (but older) set of rules, alerts, and
dashboards were maintained to reflect metric name changes
2019-02-18 12:19:23 -08:00
Dalton Hubble d988822741 Document and recommend terraform-provider-matchbox v0.2.3
* https://github.com/coreos/terraform-provider-matchbox/releases/tag/v0.2.3
2019-02-16 15:07:49 -08:00
Dalton Hubble ccd96c37da Update Kubernetes from v1.13.2 to v1.13.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1133
2019-02-01 23:26:13 -08:00
Carlos Cobo acd539f865 Fix architecture title for DigitalOcean (#390) 2019-02-01 23:20:06 -08:00
Dalton Hubble d02af3d40d Update mkdocs-material from v3.2.0 to v3.3.0
* Fix minor docs typos and errors
* Allow a transient verison of the six PyPi package, the
docs build system can use the 0.12.0 (0.11.0 broke sync
tools so pinning to 0.10.0 was previously needed)
2019-01-29 23:16:57 -08:00
Dalton Hubble 6b87132aa1 Fix per platform/OS links on the docs home page
* Considering the reader of each, the Github README module links
can go to module source code and docs module links can go to the
associated tutorial docs for the platform/OS
2019-01-26 16:50:00 -08:00
Dalton Hubble 1d66ad33f7 Change AWS worker modules' default type from t2.small to t3.small
* Worker instance types weren't updated in #365
2019-01-12 00:07:48 -08:00
Dalton Hubble 4d32b79c6f Update Kubernetes from v1.13.1 to v1.13.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1132
2019-01-12 00:00:53 -08:00
Dalton Hubble 1c6a0392ad Fix missing slash in links in the AWS tutorial 2019-01-02 23:33:02 -08:00
Dalton Hubble f2f4deb8bb Change AWS default type from t2.small to t3.small
* T3 is the next generation general purpose burstable
instance type. Compared with t2.small, the t3.small is
cheaper, has 2 vCPU (instead of 1) and provides 5 Gbps
of pod-to-pod bandwidth (instead of 1 Gbps)
2018-12-18 12:38:35 -08:00
Dalton Hubble d42f47c49e Update terraform-provider-ct plugin from v0.2.1 to v0.3.0
* Provide migration instructions for upgrading terraform-provider-ct
in-place for v1.12.2+ clusters
* Require switching from ~/.terraformrc to the Terraform third-party
plugins directory ~/.terraform.d/plugins/
* Require Container Linux 1688.5.3 or newer
2018-12-17 14:13:50 -08:00
Dalton Hubble 018c5edc25 Update Kubernetes from v1.13.0 to v1.13.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1131
2018-12-15 11:44:57 -08:00
Dalton Hubble d31f444fcd Update Kubernetes from v1.12.3 to v1.13.0 2018-12-03 20:44:32 -08:00
Dalton Hubble 42c523e6a2 Recommend switch from ~/.terraformrc to 3rd-party plugin dir
* Switch tutorials from using ~/.terraformrc to using the 3rd-party
plugin directory so 3rd-party plugins can be pinned
* Continue to show using terraform-provider-ct v0.2.2. Updating to
a newer version is only safe once all managed clusters are v1.12.2
or higher
2018-11-28 00:03:15 -08:00
Dalton Hubble 64b4c10418 Improve features and modules list docs
* Remove bullet about isolating workloads on workers, its
now common practice and new users will assume it
* List advanced features available in each module
* Fix erroneous Kubernetes version listing for Google Cloud
Fedora Atomic
2018-11-26 22:58:00 -08:00
Dalton Hubble 5b27d8d889 Update Kubernetes from v1.12.2 to v1.12.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.12.md/#v1123
2018-11-26 21:06:09 -08:00
Dalton Hubble 7f8e781ae4 Measure DigitalOcean network performance
* Measuring pod-to-pod bandwidth in a few regions (NYC3, FRA1,
SFO1) shows DigitalOcean has made some improvements
2018-11-11 21:08:10 -08:00
Dalton Hubble 31f48a81a8 Update docs to show flannel DaemonSet instead of kube-flannel
* No functional change, the rename is just for consistency
2018-11-10 15:16:06 -08:00
Dalton Hubble 721c847943 Set kube-apiserver kubelet preferred address types
* Prefer InternalIP and ExternalIP over the node's hostname,
to match upstream behavior and kubeadm
* Previously, hostname-override was used to set node names
to internal IP's to work around some cloud providers not
resolving hostnames for instances (e.g. DO droplets)
2018-11-03 22:31:55 -07:00
Dalton Hubble 5be5b261e2 Add an IPv6 address and forwarding rules on Google Cloud
* Allowing serving IPv6 applications via Kubernetes Ingress
on Typhoon Google Cloud clusters
* Add `ingress_static_ipv6` output variable for use in AAAA
DNS records
2018-10-28 14:30:58 -07:00
Dalton Hubble f034ef90ae Add DigitalOcean AAAA DNS records resolving to workers
* Improve the workers "round-robin" DNS FQDN that is created
with each cluster by adding AAAA records
* CNAME's resolving to the DigitalOcean `workers_dns` output
can be followed to find a droplet's IPv4 or IPv6 address
* The CNI portmap plugin doesn't support IPv6. Hosting IPv6
apps is possible, but requires editing the nginx-ingress
addon with `hostNetwork: true`
2018-10-27 23:09:24 -07:00
Dalton Hubble f1da0731d8 Update Kubernetes from v1.12.1 to v1.12.2
* Update CoreDNS from v1.2.2 to v1.2.4
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.12.md#v1122
* https://coredns.io/2018/10/17/coredns-1.2.4-release/
* https://coredns.io/2018/10/16/coredns-1.2.3-release/
2018-10-27 15:47:57 -07:00
Robert Fairburn 0be4673e44 Add disk_iops variable for AWS
* Setting disk_iops is required for disk_type io1
* https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html#EBSVolumeTypes
2018-10-17 22:18:54 -07:00
Michael Schubert d10620fb58 Add support for Flatcar Linux bare-metal cached_install
* Support bare-metal cached_install=true mode with Flatcar Linux
where assets are fetched from the Matchbox assets cache instead
of from the upstream Flatcar download server
* Skipped in original Flatcar support to keep it simple
https://github.com/poseidon/typhoon/pull/209
2018-10-16 21:15:24 -07:00
Dalton Hubble 9b6113a058 Update Kubernetes from v1.11.3 to v1.12.1
* Mount an empty dir for the controller-manager to work around
https://github.com/kubernetes/kubernetes/issues/68973
* Update coreos/pod-checkpointer to strip affinity from
checkpointed pod manifests. Kubernetes v1.12.0-rc.1 introduced
a default affinity that appears on checkpointed manifests; but
it prevented scheduling and checkpointed pods should not have an
affinity, they're run directly by the Kubelet on the local node
* https://github.com/kubernetes-incubator/bootkube/issues/1001
* https://github.com/kubernetes/kubernetes/pull/68173
2018-10-16 20:28:13 -07:00
Dalton Hubble 2e89e161e9 Remove Azure admin_password (disabled) now that its optional
* Requires terraform-provider-azurerm v1.16.0 or higher
https://github.com/terraform-providers/terraform-provider-azurerm/pull/1958
2018-10-13 22:40:58 -07:00
Stephen Cuppett ff0c271d7b Fix Calico network policy docs link 2018-10-07 21:02:16 +02:00
Dalton Hubble ee569e3a59 Change rooted links to absolute links for mkdocs v1.0
* mkdocs v1.0 now evaluates links beginning with / as
absolute links that are only prepended with the site url
2018-10-07 20:51:54 +02:00
Dalton Hubble db36036c81 Require terraform-provider-digitalocean plugin ~> 1.0
* Require a terraform-provider-digitalocean plugin version of
1.0 or higher within the same major version (e.g. allow 1.1 but
not 2.0)
* Change requirement from ~> 0.1.2 (which allowed up to but not
including 1.0 release)
2018-10-02 17:09:19 +02:00
Dennis Schridde f8474d68c9 Fix typo in maintenance docs 2018-09-25 03:07:00 -07:00
Dalton Hubble ad871dbfa9 Update Kubernetes from v1.11.2 to v1.11.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1113
2018-09-13 18:50:41 -07:00
Sendhil Panchadsaram 5801be53ff Fix typo in ingress addon docs 2018-09-08 18:28:42 -07:00
Sendhil Panchadsaram c1b1669cf8 Fix Google Cloud preemption link in docs 2018-09-08 18:28:25 -07:00
Dalton Hubble 4882fe1053 Add docs for Azure Ingress and worker pools
* Azure worker pools must be in the same region as
the cluster itself unfortunately
2018-08-27 23:30:56 -07:00
Dalton Hubble 019009e9ee Add outputs for Azure ingress IPv4 and worker pools 2018-08-27 23:30:32 -07:00
Dalton Hubble 991a5c6cee Add new tutorial docs and links 2018-08-27 23:30:32 -07:00
Dalton Hubble e58b424882 Fix firewall to allow etcd client traffic between controllers
* Broaden internal-etcd firewall rule to allow etcd client
traffic (2379) from other controller nodes
* Previously, kube-apiservers were only able to connect to their
node's local etcd peer. While master node outages were tolerated,
reaching a healthy peer took longer than neccessary in some cases
* Reduce time needed to bootstrap a cluster
2018-08-21 23:51:40 -07:00
Dalton Hubble 99e3721181 Mention Fedora Atomic system container artifacts
* Typhoon for Fedora Atomic uses system containers, container
images containing metadata, but built directly from upstream
and published and serve through Quay.io
* https://github.com/poseidon/system-containers
2018-08-21 21:52:52 -07:00
Dalton Hubble ea365b551a Fix docs mentions of ELBs to NLBs
* Typhoon AWS clusters use an NLB rather than an ELB,
since v1.10.5
* Add a few missing links in CHANGES
2018-08-21 21:40:06 -07:00
Dalton Hubble dbdc3fc850 Add nginx-ingress addon manifests for bare-metal 2018-08-11 12:14:23 -07:00
Dalton Hubble f7ebdf475d Update Kubernetes from v1.11.1 to v1.11.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1112
2018-08-07 21:57:25 -07:00
Dalton Hubble 716dfe4d17 Fix cluster links in customization docs 2018-07-29 12:46:59 -07:00
Dalton Hubble 90c4a7483d Combine bare-metal CLC snippets maps into one map 2018-07-26 23:31:08 -07:00
Dalton Hubble 4e7dfc115d Support Container Linux Config snippets on bare-metal 2018-07-25 23:14:54 -07:00
Dalton Hubble ec5ea51141 Remove old migration docs and fix link 2018-07-22 12:23:37 -07:00
Dalton Hubble d8d524d10b Update Kubernetes from v1.11.0 to v1.11.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#v1111
2018-07-20 00:41:27 -07:00
Dalton Hubble 3352388fe6 Update changelog and docs for release 2018-07-04 12:28:25 -07:00
Dalton Hubble 915f89d3c8 Update Fedora Atomic from 27 to 28 on bare-metal 2018-07-04 11:41:54 -07:00
Dalton Hubble 6f958d7577 Replace kube-dns with CoreDNS
* Add system:coredns ClusterRole and binding
* Annotate CoreDNS for Prometheus metrics scraping
* Remove kube-dns deployment, service, & service account
* https://github.com/poseidon/terraform-render-bootkube/pull/71
* https://kubernetes.io/blog/2018/06/27/kubernetes-1.11-release-announcement/
2018-07-01 22:55:01 -07:00
Dalton Hubble ee31074679 Promote Typhoon Google Cloud for Container Linux to stable 2018-07-01 22:52:27 -07:00
Dalton Hubble 97517fa7f3 Fix ingress addons docs to use ingress_static_ipv4 var 2018-07-01 22:48:41 -07:00
Dalton Hubble 18502d64d6 Update Fedora Atomic from 27 to 28 on GCP 2018-07-01 22:46:51 -07:00
Dalton Hubble 8464b258d8 Update Kubernetes from v1.10.5 to v1.11.0
* Force apiserver to stop listening on 127.0.0.1:8080
* Remove deprecated Kubelet `--allow-privileged`. Defaults to
true. Use `PodSecurityPolicy` if limiting is desired
* https://github.com/kubernetes/kubernetes/releases/tag/v1.11.0
* https://github.com/poseidon/terraform-render-bootkube/pull/68
2018-06-27 22:47:35 -07:00
Dalton Hubble 855aec5af3 Clarify AWS module output names and changes 2018-06-23 15:29:13 -07:00
Dalton Hubble f4d3059b00 Update Kubernetes from v1.10.4 to v1.10.5
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1105
2018-06-21 22:51:39 -07:00
Dalton Hubble 6c5a1964aa Change kube-apiserver port from 443 to 6443
* Adjust firewall rules, security groups, cloud load balancers,
and generated kubeconfig's
* Facilitates some future simplifications and cost reductions
* Bare-Metal users who exposed kube-apiserver on a WAN via their
router or load balancer will need to adjust its configuration.
This is uncommon, most apiserver are on LAN and/or behind VPN
so no routing infrastructure is configured with the port number
2018-06-19 23:48:51 -07:00
Dalton Hubble 0764bd30b5 Fix typo in AWS MTU tip for using jumbo packets 2018-06-11 18:11:50 -07:00
Dalton Hubble 79260c48f6 Update Kubernetes from v1.10.3 to v1.10.4 2018-06-06 23:23:11 -07:00
Dalton Hubble 4d75ae1373 Recommend against AWS controllers smaller than t2.small 2018-05-30 22:57:15 -07:00
Dalton Hubble 4ac4d7cbaf Add docs fixes and Flatcar Linux announcement 2018-05-22 21:22:50 -07:00
Dalton Hubble 4ea1fde9c5 Update Kubernetes from v1.10.2 to v1.10.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#v1103
* Update Calico from v3.1.1 to v3.1.2
2018-05-21 21:38:43 -07:00
Dalton Hubble 0c3557e68e Allow Flatcar Linux os_channel on bare-metal
* Choose the Container Linux derivative Flatcar Linux on
bare-metal by setting os_channel to flatcar-stable, flatcar-beta
or flatcar-alpha
* As with Container Linux from Red Hat, the version (os_version)
must correspond to the channel being used
* Thank you to @dongsupark from Kinvolk
2018-05-17 20:09:36 -07:00
Dalton Hubble adc6c6866d Rename container_linux_ bare-metal variables
* Allow for Container Linux derivatives
* Replace container_linux_channel variable with `os_channel`
* Replace `container_linux_version` variable with `os_version`
* Please change values `stable`, `beta`, or `alpha` to `coreos-stable`,
`coreos-beta`, `coreos-alpha` (action required!)
2018-05-16 22:40:39 -07:00
Dalton Hubble 9ac7b0655f Add bare-metal network_ip_autodetection_method variable for multi-NIC
* Allow setting the Calico host IPv4 address autodetection method
* Use Calico's default "first-found" method to support single NIC
and bonded NIC nodes
* Allow methods like `can-reach=IP` or `interface=REGEX` for multi
NIC nodes
* https://docs.projectcalico.org/v3.1/reference/node/configuration#ip-autodetection-methods
2018-05-15 23:27:34 -07:00
Dalton Hubble 5eb11f5104 Allow Flatcar Linux os_image on AWS, rename os_channel
* Replace os_channel variable with os_image to align naming
across clouds. Users who set this option to stable, beta, or
alpha should now set os_image to coreos-stable, coreos-beta,
or coreos-alpha.
* Default os_image to coreos-stable. This continues to use
the most recent image from the stable channel as always.
* Allow Container Linux derivative Flatcar Linux by setting
os_image to `flatcar-stable`, `flatcar-beta`, `flatcar-alpha`
2018-05-12 11:41:58 -07:00
Michael Holt a5916da0e2 Update min AWS provider from v1.11 to v1.13 2018-05-02 15:16:03 -07:00
Dalton Hubble cc29530ba0 Allow preemptible workers on AWS via spot instances
* Add `worker_price` to allow worker spot instances. Defaults
to empty string for the worker autoscaling group to use regular
on-demand instances.
* Add `spot_price` to internal `workers` module for spot worker
pools
* Note: Unlike GCP `preemptible` workers, spot instances require
you to pick a bid price.
2018-04-29 13:31:17 -07:00
Dalton Hubble d81a091756 Switch Atomic docs to reference v1.10.2 tag 2018-04-28 00:27:23 -07:00
Dalton Hubble 32ddfa94e1 Update Kubernetes from v1.10.1 to v1.10.2
* https://github.com/kubernetes/kubernetes/releases/tag/v1.10.2
2018-04-28 00:27:00 -07:00
Dalton Hubble 86e5adf348 Set commit hash so tutorials work right now
* These modules are alpha, anyone wanting to try then
is probably fine using the raw sha
2018-04-26 09:08:06 -07:00
Dalton Hubble a89f25e31a Fix typo in announcement 2018-04-26 08:36:50 -07:00
Dalton Hubble 2e4bf4d7ae Add Fedora Atomic announcement and improve docs 2018-04-26 08:18:39 -07:00
Dalton Hubble b6a51d0b68 Add architecture docs on operating systems 2018-04-25 22:59:48 -07:00
Dalton Hubble d784b0fca6 Switch to quay.io/poseidon tagged system containers 2018-04-25 18:15:18 -07:00
Dalton Hubble cd913986df Write documentation for Fedora Atomic 2018-04-24 01:10:27 -07:00
Dalton Hubble af54efec28 Organize docs by operating system 2018-04-23 19:55:28 -07:00
Dalton Hubble ad2e4311d1 Switch GCP network lb to global TCP proxy lb
* Allow multi-controller clusters on Google Cloud
* GCP regional network load balancers have a long open
bug in which requests originating from a backend instance
are routed to the instance itself, regardless of whether
the health check passes or not. As a result, only the 0th
controller node registers. We've recommended just using
single master GCP clusters for a while
* https://issuetracker.google.com/issues/67366622
* Workaround issue by switching to a GCP TCP Proxy load
balancer. TCP proxy lb routes traffic to a backend service
(global) of instance group backends. In our case, spread
controllers across 3 zones (all regions have 3+ zones) and
organize them in 3 zonal unmanaged instance groups that
serve as backends. Allows multi-controller cluster creation
* GCP network load balancers only allowed legacy HTTP health
checks so kubelet 10255 was checked as an approximation of
controller health. Replace with TCP apiserver health checks
to detect unhealth or unresponsive apiservers.
* Drawbacks: GCP provision time increases, tailed logs now
timeout (similar tradeoff in AWS), controllers only span 3
zones instead of the exact number in the region
* Workaround in Typhoon has been known and posted for 5 months,
but there still appears to be no better alternative. Its
probably time to support multi-master and accept the downsides
2018-04-18 00:09:06 -07:00
@luke 490b628e2d Use relative image links to appear in Github markdown 2018-04-17 23:40:58 -07:00
Dalton Hubble 77c0a4cf2e Update Kubernetes from v1.10.0 to v1.10.1
* Use kubernetes-incubator/bootkube v0.12.0
2018-04-12 20:57:31 -07:00
Matt Dorn 2eaf858c5c
Update example BGPPeer manifest
Previous example may have been outdated. It resulted in `error: unable to recognize "example.yaml": no matches for /, Kind=bgpPeer` .

See https://docs.projectcalico.org/v3.0/reference/calicoctl/resources/bgppeer.
2018-04-09 23:23:18 -05:00
Dalton Hubble b8656fd74b Clarify bare-metal SSH instructions 2018-04-08 14:11:05 -07:00
Dalton Hubble 1cc043d1eb Update Kubernetes from v1.9.6 to v1.10.0 2018-03-30 22:14:07 -07:00
Dalton Hubble f8e9bfb1c0 Add disk_type variable for EBS volume type on AWS
* Change EBS volume type from `standard` ("prior generation)
 to `gp2`. Prometheus alerts are tuned for SSDs
* Other platforms have fast enough disks by default
2018-03-29 22:51:54 -07:00
Dalton Hubble fdb543e834 Add optional controller_type and worker_type vars on GCP
* Remove optional machine_type variable on Google Cloud
* Use controller_type and worker_type instead
2018-03-25 22:11:18 -07:00
Dalton Hubble 8d3d4220fd Add disk_size variable on Google Cloud 2018-03-25 22:04:14 -07:00
Dalton Hubble 38adb14bd2 Remove optional variable networking on Digital Ocean
* Calico isn't viable on Digital Ocean because their firewalls
do not support IP-IP protocol. Its not viable to run a cluster
without firewalls just to use Calico.
* Remove the caveat note. Don't allow users to shoot themselves
in the foot
2018-03-25 21:48:51 -07:00
Dalton Hubble e43cf9f608 Organize and cleanup variable descriptions 2018-03-25 21:44:43 -07:00
Dalton Hubble 455a4af27e Improve cluster definition examples in docs 2018-03-25 20:41:52 -07:00
Dalton Hubble 39876e455f Fix docs to reflect enforced provider versions 2018-03-25 11:34:39 -07:00
Dalton Hubble a04ef3919a Update Kubernetes from v1.9.5 to v1.9.6 2018-03-21 20:29:52 -07:00
Dalton Hubble 758c09fa5c Update Kubernetes from v1.9.4 to v1.9.5 2018-03-19 00:25:44 -07:00
Dalton Hubble 7f7bc960a6 Set default Google Cloud os_image to coreos-stable 2018-03-19 00:08:26 -07:00
Dalton Hubble 29108fd99d Improve changelog with migration links 2018-03-18 23:54:55 -07:00
Dalton Hubble 18d08de898 Add Container Linux Config snippet docs 2018-03-18 23:22:40 -07:00
Dalton Hubble d621512dd6 Promote AWS platform from beta to stable 2018-03-12 21:15:53 -07:00
Dalton Hubble 21f2cef12f Improve changelog, README, and index page 2018-03-12 20:58:02 -07:00
Dalton Hubble 931e311786 Update Kubernetes from v1.9.3 to v1.9.4 2018-03-12 18:07:50 -07:00
Dalton Hubble 2a4595eeee Add links to the charitable donations list 2018-03-11 14:51:40 -07:00
Dalton Hubble c112ee3829 Rename cluster_name to name in internal module
* Ensure consistency between AWS and GCP platforms
2018-03-03 17:52:01 -08:00
Dalton Hubble 45b556c08f Fix overly strict firewall for GCP "worker pools"
* Fix issue where worker firewall rules didn't apply to
additional workers attached to a GCP cluster using the new
"worker pools" feature (unreleased, #148). Solves host
connection timeouts and pods not being scheduled to attached
worker pools.
* Add `name` field to GCP internal worker module to represent
the unique name of of the worker pool
* Use `cluster_name` field of GCP internal worker module for
passing the name of the cluster to which workers should be
attached
2018-03-03 17:40:17 -08:00