Custom thresholding issue

I need finer-grained thresholding of filesystem space.

We currently have an 80/warning, 85/minor, 90/major, 95/critical thresholding config that is applied to all filesystems, but this doesn’t allow for server- or filesystem-specific variations. For example, on one system we need to keep 40% free space in one filesystem to leave room for Java heap dumps, while other systems hold static content where 90% utilisation is normal. So I’ve been developing a custom monitor to try and address this.

What I have come up with is a script that builds an XML file for each server containing current filesystem utilisation statistics, and the warning/minor/major/critical utilisation levels for each filesystem pulled from a configuration file.
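To make the idea concrete, here is a minimal Python sketch of such a script. It is not the actual script: the element names (`server`, `filesystem`, and the severity names), the `build_server_xml` function and the `overrides` structure are all assumptions made for illustration, and the usage figures are passed in rather than read from the system.

```python
import xml.etree.ElementTree as ET

# Site-wide defaults (percent used), matching the 80/85/90/95 config above.
DEFAULT_THRESHOLDS = {"warning": 80, "minor": 85, "major": 90, "critical": 95}

def build_server_xml(server, usage, overrides):
    """Return an XML string with usage and threshold levels per filesystem.

    usage:     {mount_point: percent_used}
    overrides: {mount_point: {severity: percent}} for filesystems that
               deviate from DEFAULT_THRESHOLDS (hypothetical config format)
    """
    root = ET.Element("server", name=server)
    for mount, pct_used in sorted(usage.items()):
        # Per-filesystem overrides win over the site-wide defaults.
        levels = {**DEFAULT_THRESHOLDS, **overrides.get(mount, {})}
        fs = ET.SubElement(root, "filesystem", name=mount)
        ET.SubElement(fs, "fsPctUsed").text = str(pct_used)
        for severity, pct in levels.items():
            ET.SubElement(fs, severity).text = str(pct)
    return ET.tostring(root, encoding="unicode")
```

A heap-dump filesystem that must keep 40% free would then carry an override such as `{"/dumps": {"critical": 60}}`, while filesystems without an entry fall back to the defaults.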

I currently trigger the threshold with an expression that divides current usage by the threshold and compares the result against a value of 1. The issue is that the notification loses the utilisation percentage and the threshold:

  <expression description="Trigger an alert when the percentage of disk space used reaches or goes above P4 threshold for two consecutive measurement intervals" 
              type="high"
              ds-type="FilesystemMon"
              value="1.0"
              rearm="0.99"
              trigger="2"
              ds-label="fsName"
              triggeredUEI="uei.opennms.org/ABBCS/highdiskP4-trigger" 
              rearmedUEI="uei.opennms.org/ABBCS/highdiskP4-rearm"
              expression="fsPctUsed / fsThreshP4">
  </expression>
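For anyone following along, this is roughly how I understand that definition to behave, written as a Python sketch. It is a simplified model under my own assumptions (trigger when the ratio is at or above `value` for `trigger` consecutive samples, rearm when it drops to `rearm` or below), not the actual OpenNMS evaluator, and `evaluate_high` is a made-up name. It also shows the problem: the only number the evaluation sees is the ratio, not the percentage or the threshold behind it.

```python
def evaluate_high(ratios, value=1.0, rearm=0.99, trigger=2):
    """Simplified model of a 'high' threshold; not OpenNMS's actual code.

    ratios: successive samples of fsPctUsed / fsThreshP4
    Returns a list of (sample_index, event_kind, ratio) tuples.
    """
    events = []
    exceeded = 0   # consecutive samples at or above `value`
    armed = True   # False once triggered, until the rearm level is reached
    for i, ratio in enumerate(ratios):
        if armed:
            if ratio >= value:
                exceeded += 1
                if exceeded >= trigger:
                    events.append((i, "trigger", ratio))
                    armed = False
                    exceeded = 0
            else:
                exceeded = 0
        elif ratio <= rearm:
            events.append((i, "rearm", ratio))
            armed = True
    return events

# Usage samples for a filesystem whose threshold is 60% used.
samples = [58, 61, 62, 59]
events = evaluate_high([pct / 60 for pct in samples])
```

With these samples the model triggers on the third sample (the second consecutive one over the threshold) and rearms on the fourth.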

Ideally I’d like to be able to specify the threshold by having a variable reference in the “value” field of the threshold definition which would allow a datasource value to be referenced, so that I could have:

ds-name="fsPctUsed"
value="fsThreshP4"

However, at present I don’t think I can reference a datasource value in the “value” field of a threshold, so I’m wondering what the best way is to get the values used in the expression into the notification.

I also have a requirement to customise the ticketing system queue that a given filesystem’s alerts are assigned to. I have this info available in the XML file, so if I could get that string value into the notification, that would be great too, as my ticketing system integration script could then make use of it.

One possible way I could do this would be to set up the notification with a custom notification path, but if there is a better way to achieve the end result, I’m open to ideas. (I don’t want to re-invent the wheel if I don’t have to…)

Regards,
John

I don’t really have an answer for you, as I don’t do custom thresholding or any ticketing. But I wanted to note that this seems like a case where using Meta-Data on a node could make threshold definitions more dynamic.

Thanks. I’ll look into that next week.

I don’t think that thresholds can utilize metadata at this point. More of a note to the devs to see if it is possible.

Not possible yet. Sounds like a good use though: using meta-data in both the thresholding expressions (filesystem variations) and the events that are generated by thresholding (ticket system queue).

We’re currently working on some updates to thresholding for Horizon 25. We’ll consider this too.

That sounds great. In the meantime I’ve got my customisable thresholds and notifications working the way I want. The team to notify is also customisable per filesystem. My notification command replaces the message with the relevant values.
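As a rough illustration of that last step, a notification command could look up the filesystem in the per-server XML and rebuild the message from it. This is only a sketch under assumptions: the XML layout, the `queue` element and the `build_message` function are hypothetical, and how the filesystem name reaches the command depends on how the notification is configured.

```python
import xml.etree.ElementTree as ET

def build_message(server_xml, fs_name, level):
    """Return (queue, text) for one filesystem from the per-server XML.

    level is the name of the threshold element to report, e.g. "critical".
    The element names used here are assumptions, not a fixed schema.
    """
    root = ET.fromstring(server_xml)
    for fs in root.findall("filesystem"):
        if fs.get("name") == fs_name:
            used = float(fs.findtext("fsPctUsed"))
            limit = float(fs.findtext(level))
            # Fall back to a default queue when none is set for this filesystem.
            queue = fs.findtext("queue", default="default")
            text = f"{fs_name} is {used:.0f}% used (threshold {limit:.0f}%)"
            return queue, text
    raise KeyError(fs_name)

sample_xml = (
    '<server name="app01">'
    '<filesystem name="/dumps">'
    "<fsPctUsed>62.0</fsPctUsed><critical>60</critical><queue>java-team</queue>"
    "</filesystem>"
    "</server>"
)
queue, text = build_message(sample_xml, "/dumps", "critical")
```

The returned queue name is what a ticketing integration script would use to route the alert, and the text restores the utilisation and threshold figures that the ratio-based expression drops.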