Troubleshoot performance data collection from SNMP agents

Problem

The SNMP data collection configuration is not working.

Diagnosis

Missing SNMP service

SNMP service on interface

The SNMP service should be added to the node either directly on the requisition or through a detector.

SNMP service doesn’t come online

The SNMP credentials should be configured correctly. You can configure them using the GUI or directly in snmp-config.xml.

RRD file is not created

There are some general things that can be checked.

collectd and datacollection config need matching collection names

  • collectd-configuration.xml: <parameter key="collection" value="NAME"/>
  • datacollection-config.xml: <snmp-collection name="NAME"/>

dataCollectionGroup has to match datacollection-group name

  • datacollection-config.xml: <include-collection dataCollectionGroup="NAME"/>
  • ${OPENNMS_HOME}/etc/datacollection/someDatacollectionFile.xml: <datacollection-group name="NAME">
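
Both name checks above can be scripted. The following is a minimal Python sketch, not an OpenNMS tool: all function names are made up for illustration, and it assumes the collector service is literally named "SNMP" in collectd-configuration.xml (other collectors also use a "collection" parameter, so they are skipped). It takes the XML file contents as strings and reports names that are referenced but never defined.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop a namespace prefix such as '{http://...}parameter'."""
    return tag.rsplit("}", 1)[-1]


def collectd_snmp_collections(collectd_xml, service_name="SNMP"):
    """Collection names referenced by the SNMP service entries in collectd.

    Assumption: the service is named "SNMP"; pass another name if yours differs.
    """
    names = set()
    for service in ET.fromstring(collectd_xml).iter():
        if local_name(service.tag) != "service" or service.get("name") != service_name:
            continue
        for param in service:
            if local_name(param.tag) == "parameter" and param.get("key") == "collection":
                names.add(param.get("value"))
    return names


def attribute_values(xml_text, element, attribute):
    """All values of one attribute on elements with the given local name."""
    return {
        el.get(attribute)
        for el in ET.fromstring(xml_text).iter()
        if local_name(el.tag) == element and el.get(attribute) is not None
    }


def unmatched_collections(collectd_xml, datacollection_xml):
    """Collections used in collectd but missing as <snmp-collection name=...>."""
    wanted = collectd_snmp_collections(collectd_xml)
    defined = attribute_values(datacollection_xml, "snmp-collection", "name")
    return wanted - defined


def unmatched_groups(datacollection_xml, group_xmls):
    """dataCollectionGroup names included but not defined in any group file."""
    wanted = attribute_values(datacollection_xml, "include-collection", "dataCollectionGroup")
    defined = set()
    for text in group_xmls:
        defined |= attribute_values(text, "datacollection-group", "name")
    return wanted - defined
```

If both functions return an empty set, the names line up. Anything they return is a name that is referenced but not defined, for example because of a typo.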

sysObjectID has to match sysoidMask

In ${OPENNMS_HOME}/etc/datacollection/someDatacollectionFile.xml the sysoidMask in the systemDef has to match the sysObjectID of the node.

You can get the node's sysObjectID using: snmpwalk -c COMMUNITY -v2c IP -On .1.3.6.1.2.1.1.2.0

sysoid vs sysoidMask in datacollection config files

sysoid matches exactly against the sysObjectID of a node:

<sysoid>.1.3.6.1.4.1.19746.3.1.36</sysoid>

sysoidMask matches against everything starting with the defined OID:

<sysoidMask>.1.3.6.1.4.1.19746.</sysoidMask>
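
To make the difference concrete, here is a small Python sketch of the two matching rules as described above (exact match versus plain prefix match). It is an illustration, not the OpenNMS implementation, and the function names are hypothetical. The parser assumes the usual Net-SNMP output format of the snmpwalk command shown earlier, i.e. a line containing "OID:" followed by the value.

```python
def parse_sys_object_id(snmpwalk_line):
    """Extract the sysObjectID from one line of snmpwalk -On output.

    Assumed input format:
    ".1.3.6.1.2.1.1.2.0 = OID: .1.3.6.1.4.1.19746.3.1.36"
    """
    return snmpwalk_line.split("OID:", 1)[1].strip()


def matches_system_def(sys_object_id, sysoid=None, sysoid_mask=None):
    """Apply the sysoid (exact) or sysoidMask (prefix) rule to a node."""
    if sysoid is not None:
        return sys_object_id == sysoid
    if sysoid_mask is not None:
        return sys_object_id.startswith(sysoid_mask)
    return False


node_oid = parse_sys_object_id(".1.3.6.1.2.1.1.2.0 = OID: .1.3.6.1.4.1.19746.3.1.36")

# Exact rule: only this one device type matches.
exact = matches_system_def(node_oid, sysoid=".1.3.6.1.4.1.19746.3.1.36")

# Prefix rule: every sysObjectID under the vendor subtree matches.
by_mask = matches_system_def(node_oid, sysoid_mask=".1.3.6.1.4.1.19746.")
```

With a plain prefix comparison like this, the trailing dot in the mask matters: ".1.3.6.1.4.1.19746." cannot accidentally match an unrelated enterprise number that merely starts with the same digits.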

Collectd filter mismatch

The filter configuration in collectd-configuration.xml does not match your node.

Multiple primary SNMP interfaces

If a node has multiple interfaces and more than one is marked as SNMP primary (type P), some data, e.g. interface metrics, may not be collected.

Since OpenNMS 26.1.1, Provisiond does not accept definitions with multiple primary interfaces. See issue NMS-12605.

SNMP v2 in v1

Some buggy SNMP agents fail to exclude Counter64 objects from view when responding to SNMPv1 requests (as mandated by RFC 3584 § 4.2.2.1). To relax handling of v1 responses so that Counter64 varbinds are permitted rather than discarded as ill-formed (per the same RFC), set org.opennms.snmp.snmp4j.allowSNMPv2InV1 to true in opennms.properties.

To validate that the SNMP settings are configured properly, I like to use snmp:show-config 192.168.1.1 from the Karaf shell.

It’s also possible to perform SNMP walks from the shell. These use the configured SNMP settings and the same SNMP client that is used when data collection is performed: snmp:walk 192.168.1.1 .1.3.6.1.2.1.1.2

You can also invoke the SNMP collector manually and verify the results using: collection:collect --node 1 org.opennms.netmgt.collectd.SnmpCollector 192.168.1.1

Note that you must specify the node ID, and the IP address you pass must belong to that node.

That link to a specific line number is wrong.
Better to reference the actual setting in opennms.properties:
org.opennms.snmp.snmp4j.allowSNMPv2InV1=true

Adding another one

Check for SNMP timeouts.
To do so, enable debug logging for collectd: in log4j2.xml, set the <KeyValuePair key="collectd"> entry to DEBUG instead of WARN.

Some low-end SNMP devices have real trouble answering an SNMP request or walk in time; a walk can take several minutes. The default timeout in OpenNMS is quite low (1800 milliseconds?).

If that’s the case, raise the timeout in snmp-config.xml for the definition that covers your device, e.g.:

<definition timeout="30000">   <!-- that's 30 seconds -->
      <range begin="10.160.8.10" end="10.160.8.18"/>
</definition>

Or you can set it globally, e.g.:

<snmp-config xmlns="http://xmlns.opennms.org/xsd/config/snmp" version="v2c" read-community="public" timeout="3000" retry="1">
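
One thing to keep in mind when raising the timeout: it applies per attempt, so retries multiply it. The Python sketch below is back-of-the-envelope arithmetic only. It assumes every retry waits the full timeout again (the usual behaviour of SNMP clients; I have not checked the exact retry handling in OpenNMS), and the function name is made up.

```python
def worst_case_wait_ms(timeout_ms, retries):
    """Upper bound on the wait for one request that is never answered.

    Assumption: one initial attempt plus `retries` further attempts,
    each waiting the full timeout.
    """
    return timeout_ms * (retries + 1)


# Global example above: timeout="3000" retry="1"
global_wait = worst_case_wait_ms(3000, 1)

# Per-definition example: timeout="30000", assuming retry="1" also applies
slow_device_wait = worst_case_wait_ms(30000, 1)
```

Under these assumptions an unresponsive agent holds up a single request for 6 seconds with the global values and for 60 seconds with the 30-second definition, so it is worth restricting long timeouts to the devices that actually need them.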