mirror of https://github.com/puppetmaster/typhoon.git, synced 2025-10-20 22:15:58 +02:00

	Add Prometheus alert rule for inactive md devices
* node-exporter exposes metrics to Prometheus about total and active md devices (e.g. disks in mdadm RAID arrays)
* Add alert that fires when a RAID disk fails or becomes inactive for another reason
@@ -496,6 +496,13 @@ data:
         annotations:
           description: device {{$labels.device}} on node {{$labels.instance}} is running
             full within the next 2 hours (mounted at {{$labels.mountpoint}})
+      - alert: InactiveRAIDDisk
+        expr: node_md_disks - node_md_disks_active > 0
+        for: 10m
+        labels:
+          severity: warning
+        annotations:
+          description: '{{$value}} RAID disk(s) on node {{$labels.instance}} are inactive'
   prometheus.rules.yaml: |
     groups:
     - name: prometheus.rules
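
The new rule could be exercised with a `promtool test rules` unit test. The following is a minimal sketch, not part of this commit. It assumes the rule group has been copied out of the ConfigMap into a standalone rules file (the name `extracted.rules.yaml` is hypothetical), and that the md metrics carry `instance` and `device` labels; the label values are made up for illustration.

```yaml
# Hypothetical promtool unit test for the InactiveRAIDDisk alert.
# Run with: promtool test rules inactive_raid_disk_test.yaml
rule_files:
  - extracted.rules.yaml   # assumption: rule group extracted from the ConfigMap

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # Array md0 has 2 member disks in total...
      - series: 'node_md_disks{instance="node1", device="md0"}'
        values: '2+0x15'
      # ...but only 1 of them is active for the whole test window.
      - series: 'node_md_disks_active{instance="node1", device="md0"}'
        values: '1+0x15'
    alert_rule_test:
      # The expression has been true for longer than `for: 10m` by this point.
      - eval_time: 11m
        alertname: InactiveRAIDDisk
        exp_alerts:
          - exp_labels:
              severity: warning
              instance: node1
              device: md0
            exp_annotations:
              description: '1 RAID disk(s) on node node1 are inactive'
```

The subtraction is evaluated before the `> 0` comparison, so `{{$value}}` is the number of inactive disks per array, and the `for: 10m` clause keeps the alert pending until the condition has held for ten minutes.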