Adjust Google Cloud worker health checks to use kube-proxy healthz

* Change the workers managed instance group to health check nodes
via HTTP probe of the kube-proxy port 10256 /healthz endpoints
* Advantages: kube-proxy is a lower value target (in case there
were bugs in firewalls) than Kubelet, it's more representative than
health checking Kubelet (Kubelet must run AND kube-proxy DaemonSet
must be healthy), and it's already used by kube-proxy liveness probes
(better discoverability via kubectl or alerts on pods crashlooping)
* Another motivator is that GKE clusters also use kube-proxy port
10256 checks to assess node health
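
For illustration, the HTTP probe described above could look roughly like the following Terraform health check. This is a minimal sketch, not the code from this commit: the resource label `worker`, the description, and all timing and threshold values are assumptions; only port 10256 and the /healthz path come from the message above.

```hcl
# Hypothetical sketch: resource label and timing values are assumptions.
resource "google_compute_health_check" "worker" {
  name        = "${var.cluster_name}-worker-health"
  description = "Health check for worker nodes via kube-proxy healthz"

  # Assumed values; tune to how quickly unhealthy nodes should be replaced.
  timeout_sec         = 20
  check_interval_sec  = 30
  healthy_threshold   = 1
  unhealthy_threshold = 6

  # Probe the kube-proxy healthz endpoint instead of the Kubelet port.
  http_health_check {
    port         = 10256
    request_path = "/healthz"
  }
}
```

A managed instance group would typically reference such a check from its `auto_healing_policies` block, and the firewall must admit Google's health check source ranges on the same port.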
Dalton Hubble
2022-08-17 20:50:52 -07:00
parent 760b4cd5ee
commit e87d5aabc3
5 changed files with 27 additions and 25 deletions

@@ -196,13 +196,13 @@ resource "google_compute_firewall" "allow-ingress" {
   target_tags = ["${var.cluster_name}-worker"]
 }
-resource "google_compute_firewall" "google-kubelet-health-checks" {
-  name = "${var.cluster_name}-kubelet-health"
+resource "google_compute_firewall" "google-worker-health-checks" {
+  name = "${var.cluster_name}-worker-health"
   network = google_compute_network.network.name
   allow {
     protocol = "tcp"
-    ports = [10250]
+    ports = [10256]
   }
   # https://cloud.google.com/compute/docs/instance-groups/autohealing-instances-in-migs