mirror of https://github.com/puppetmaster/typhoon.git (synced 2024-12-24 04:19:33 +01:00)
Add Prometheus alert rule for inactive md devices
* node-exporter exposes metrics to Prometheus about total and active md devices (e.g. disks in mdadm RAID arrays)
* Add alert that fires when a RAID disk fails or becomes inactive for another reason
parent 3352388fe6
commit 84d6cfe7b3
@@ -496,6 +496,13 @@ data:
         annotations:
           description: device {{$labels.device}} on node {{$labels.instance}} is running
             full within the next 2 hours (mounted at {{$labels.mountpoint}})
+      - alert: InactiveRAIDDisk
+        expr: node_md_disks - node_md_disks_active > 0
+        for: 10m
+        labels:
+          severity: warning
+        annotations:
+          description: '{{$value}} RAID disk(s) on node {{$labels.instance}} are inactive'
   prometheus.rules.yaml: |
     groups:
     - name: prometheus.rules