Mutual OpenNMS Monitoring

Having only one monitoring system for the whole company can be tricky.
Not only is it the classic single point of failure; geographical separation, fetching alarm data for Grafana boards, handling network traffic from all your locations/networks, and fighting with firewalls can be difficult as well.

So you might be interested in splitting your monitoring into separate instances that each monitor only their local infrastructure.

The question now is: how do you verify that all OpenNMS instances are running fine?
Of course, a nodeDown alarm will work, but that is a pretty basic way to get notified.

The answer is simple: we can use the OpenNMS-JVM service. It is the best indicator that OpenNMS is running (and hopefully running fine).

In a default installation, OpenNMS is configured to monitor itself on localhost. You have to add the service to the loopback interface 127.0.0.1 to get OpenNMS metrics.
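
If you don't want to click through the web UI, one way to add the service is the REST API. This is only a sketch: the node ID, credentials, and exact payload are assumptions, so check the REST documentation of your OpenNMS version first.

# Sketch: adjust the node ID (here: 1), credentials, and payload to your installation.
curl -u admin:admin -X POST \
     -H 'Content-Type: application/xml' \
     -d '<service status="A"><serviceType><name>OpenNMS-JVM</name></serviceType></service>' \
     'http://localhost:8980/opennms/rest/nodes/1/ipinterfaces/127.0.0.1/services'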

Hint: Check out the OpenNMS JVM metrics dashboard.

In this scenario we want to change the configs to make the service available to other OpenNMS instances.

Default configurations

collectd-configuration.xml

<service name="OpenNMS-JVM" interval="300000" user-defined="false" status="on">
   <parameter key="port" value="18980"/>
   <parameter key="retry" value="2"/>
   <parameter key="timeout" value="3000"/>
   <parameter key="rrd-base-name" value="java"/>
   <parameter key="collection" value="jsr160"/>
   <parameter key="thresholding-enabled" value="true"/>
   <parameter key="ds-name" value="opennms-jvm"/>
   <parameter key="friendly-name" value="opennms-jvm"/>
</service>

poller-configuration.xml

<service name="OpenNMS-JVM" interval="300000" user-defined="false" status="on">
   <parameter key="port" value="18980"/>
   <parameter key="retry" value="2"/>
   <parameter key="timeout" value="3000"/>
   <parameter key="rrd-repository" value="${install.share.dir}/rrd/response"/>
</service>

Required settings/changes for remote access

collectd-configuration.xml

The factory, username, and password parameters are required:

<service name="OpenNMS-JVM" interval="300000" user-defined="false" status="on">
   <parameter key="port" value="18980"/>
   <parameter key="factory" value="PASSWORD_CLEAR"/>
   <parameter key="username" value="YOUR_USERNAME"/>
   <parameter key="password" value="YOUR_PASSWORD"/>
   <parameter key="retry" value="2"/>
   <parameter key="timeout" value="3000"/>
   <parameter key="rrd-base-name" value="java"/>
   <parameter key="collection" value="jsr160"/>
   <parameter key="thresholding-enabled" value="true"/>
   <parameter key="ds-name" value="opennms-jvm"/>
   <parameter key="friendly-name" value="opennms-jvm"/>
</service>

poller-configuration.xml

The factory, username, and password parameters are required:

<service name="OpenNMS-JVM" interval="300000" user-defined="false" status="on">
   <parameter key="port" value="18980"/>
   <parameter key="factory" value="PASSWORD_CLEAR"/>
   <parameter key="username" value="YOUR_USERNAME"/>
   <parameter key="password" value="YOUR_PASSWORD"/>
   <parameter key="retry" value="2"/>
   <parameter key="timeout" value="3000"/>
   <parameter key="rrd-repository" value="/opt/opennms/share/rrd/response"/>
</service>

opennms.conf

Usually this file does not exist in $OPENNMS_HOME/etc/ and has to be created manually.

Usual installation

In a usual (non-Docker) OpenNMS installation these options are required to make JMX remotely accessible.

The server's hostname is required; replace HOSTNAME in the last line below.

# Configure Remote JMX
ADDITIONAL_MANAGER_OPTIONS="$ADDITIONAL_MANAGER_OPTIONS -Dcom.sun.management.jmxremote.port=18980"
ADDITIONAL_MANAGER_OPTIONS="$ADDITIONAL_MANAGER_OPTIONS -Dcom.sun.management.jmxremote.rmi.port=18980"
ADDITIONAL_MANAGER_OPTIONS="$ADDITIONAL_MANAGER_OPTIONS -Dcom.sun.management.jmxremote.local.only=false"
ADDITIONAL_MANAGER_OPTIONS="$ADDITIONAL_MANAGER_OPTIONS -Dcom.sun.management.jmxremote.ssl=false"
ADDITIONAL_MANAGER_OPTIONS="$ADDITIONAL_MANAGER_OPTIONS -Dcom.sun.management.jmxremote.authenticate=true"

# Listen on all interfaces (for JMX)
ADDITIONAL_MANAGER_OPTIONS="$ADDITIONAL_MANAGER_OPTIONS -Dopennms.poller.server.serverHost=0.0.0.0"

# Accept remote RMI connections on this interface (for JMX)
ADDITIONAL_MANAGER_OPTIONS="$ADDITIONAL_MANAGER_OPTIONS -Djava.rmi.server.hostname=HOSTNAME"
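
To verify the setup, you can point jconsole (shipped with every JDK) at the instance from another machine; it accepts a JMX service URL and will prompt for the username and password configured below:

jconsole service:jmx:rmi:///jndi/rmi://HOSTNAME:18980/jmxrmi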

Docker OpenNMS

When OpenNMS runs in Docker, you have to set these options via the JAVA_OPTS environment variable:

Also don’t forget to expose port 18980 in the Docker config!

JAVA_OPTS=-Dcom.sun.management.jmxremote.port=18980 -Dcom.sun.management.jmxremote.rmi.port=18980 -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=true -Dopennms.poller.server.serverHost=0.0.0.0 -Djava.rmi.server.hostname=HOSTNAME
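
A matching container start could look roughly like this. This is a sketch: the opennms/horizon image name and the published web UI port are assumptions, so adjust them to your deployment.

# Sketch: image name/tag and the 8980 web port mapping are assumptions.
docker run -d --name opennms \
  -p 8980:8980 -p 18980:18980 \
  -e JAVA_OPTS="-Dcom.sun.management.jmxremote.port=18980 -Dcom.sun.management.jmxremote.rmi.port=18980 -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=true -Dopennms.poller.server.serverHost=0.0.0.0 -Djava.rmi.server.hostname=HOSTNAME" \
  opennms/horizon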

jmxremote.access

This file is less well known but exists in every installation.
It is used to configure JMX ACLs.

The default setup looks like this:

[root@464cd274e1ff opennms]# cat etc/jmxremote.access 
admin	readwrite
jmx	    readonly

It doesn’t need any changes.

Monitoring User

The user jmx does not exist in the database by default.
Create this user using the GUI and assign it the role ROLE_JMX.
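
After creating the user, the resulting entry in etc/users.xml should look roughly like this (a sketch; the password hash is generated by OpenNMS, and role elements in users.xml require a reasonably recent Horizon version):

<user>
   <user-id>jmx</user-id>
   <!-- GENERATED_HASH is a placeholder for the hash OpenNMS writes here -->
   <password salt="true">GENERATED_HASH</password>
   <role>ROLE_JMX</role>
</user>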


That’s basically everything we need.

Adding OpenNMS Nodes

Add all instances as nodes to each OpenNMS server and add the OpenNMS-JVM service to the node's network interface (instead of localhost, as you may have done previously).
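
If you provision via requisitions, the entry for a remote instance could look like this (a sketch; the foreign-source name, foreign-id, and IP address are placeholders):

<model-import xmlns="http://xmlns.opennms.org/xsd/config/model-import" foreign-source="OpenNMS-Instances">
   <!-- placeholder node for a remote OpenNMS instance -->
   <node foreign-id="opennms-site-b" node-label="opennms-site-b">
      <interface ip-addr="192.0.2.10" snmp-primary="P">
         <monitored-service service-name="OpenNMS-JVM"/>
      </interface>
   </node>
</model-import>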

Alarms

If an OpenNMS instance goes down, you will get a nodeLostService alarm for the OpenNMS-JVM service, but you can also create thresholds on OpenNMS-internal metrics such as active poller or collector threads or the task completion ratio.
Have a look at the Grafana dashboard mentioned above to get an idea of which metrics are interesting for you.
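
Such a threshold in thresholds.xml could look roughly like this. This is a hypothetical sketch: the group name, datasource name, and values are placeholders, so look up the datasource names your instance actually collects (for example in the resource graphs) before using it.

<group name="opennms-jvm" rrdRepository="/opt/opennms/share/rrd/snmp/">
   <!-- SOME_JVM_DATASOURCE is a placeholder for a real collected datasource -->
   <threshold type="high" ds-type="node" ds-name="SOME_JVM_DATASOURCE" value="100.0" rearm="80.0" trigger="2"/>
</group>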
