Monitor APC UPS Units

How to: Monitoring APC UPS

OpenNMS ships with many data collection definitions for APC UPS units. You can also monitor the battery status, the online status, and the number of bad battery packs with the OpenNMS SNMP monitor (SnmpMonitor).

The following configuration assumes that the device supports the APC PowerNet MIB and uses these OIDs:

upsBasicBatteryStatus          .1.3.6.1.4.1.318.1.1.1.2.1.1.0
upsAdvBatteryReplaceIndicator  .1.3.6.1.4.1.318.1.1.1.2.2.4.0
upsAdvBatteryNumOfBattPacks    .1.3.6.1.4.1.318.1.1.1.2.2.5.0
upsAdvBatteryNumOfBadBattPacks .1.3.6.1.4.1.318.1.1.1.2.2.6.0
upsBasicOutputStatus           .1.3.6.1.4.1.318.1.1.1.4.1.1.0

Additional data collection

If you want to collect and store data for the above metrics relating to overall UPS and battery status, add the following mibObj elements to your $OPENNMS_HOME/etc/datacollection/apc.xml file.

<mibObj oid=".1.3.6.1.4.1.318.1.1.1.2.2.4" instance="0" alias="apcbattrepl" type="integer" />
<mibObj oid=".1.3.6.1.4.1.318.1.1.1.2.2.5" instance="0" alias="apcbattcount" type="integer" />
<mibObj oid=".1.3.6.1.4.1.318.1.1.1.2.2.6" instance="0" alias="apcbadbatt" type="integer" />
<mibObj oid=".1.3.6.1.4.1.318.1.1.1.4.1.1" instance="0" alias="apcstatus" type="integer" />
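The mibObj elements above only take effect inside a group that a systemDef includes. As a rough sketch of how that can fit together: the group name, the systemDef name, and the sysoidMask below are assumptions made for illustration, not values taken from the shipped apc.xml. If your file already has a suitable group, adding the four lines to it is enough.

```xml
<!-- Sketch only: "apc-ups-status" and "APC UPS (status sketch)" are hypothetical names. -->
<datacollection-group xmlns="http://xmlns.opennms.org/xsd/config/datacollection" name="APC">
  <!-- ifType="ignore" marks these as node-level (scalar) objects -->
  <group name="apc-ups-status" ifType="ignore">
    <mibObj oid=".1.3.6.1.4.1.318.1.1.1.2.2.4" instance="0" alias="apcbattrepl" type="integer" />
    <mibObj oid=".1.3.6.1.4.1.318.1.1.1.2.2.5" instance="0" alias="apcbattcount" type="integer" />
    <mibObj oid=".1.3.6.1.4.1.318.1.1.1.2.2.6" instance="0" alias="apcbadbatt" type="integer" />
    <mibObj oid=".1.3.6.1.4.1.318.1.1.1.4.1.1" instance="0" alias="apcstatus" type="integer" />
  </group>
  <!-- Assumed mask: matches any device whose sysObjectID is under the APC enterprise OID -->
  <systemDef name="APC UPS (status sketch)">
    <sysoidMask>.1.3.6.1.4.1.318.</sysoidMask>
    <collect>
      <includeGroup>apc-ups-status</includeGroup>
    </collect>
  </systemDef>
</datacollection-group>
```

Check the existing group and systemDef entries in your own apc.xml before adding new ones, so that the same object is not collected twice.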

Monitoring with Pollerd

To monitor these OIDs, define the following services and monitors in $OPENNMS_HOME/etc/poller-configuration.xml:

  <!-- APC UPS -->
  <service name="APC-Battery-Status" interval="300000" user-defined="false" status="on">
      <parameter key="retry" value="5"/>
      <parameter key="timeout" value="5950"/>
      <parameter key="port" value="161"/>
      <parameter key="oid" value=".1.3.6.1.4.1.318.1.1.1.2.1.1.0"/>
      <parameter key="operator" value="="/>
      <parameter key="operand" value="2"/>
      <parameter key="reason-template" value="APC battery status is not normal. The state should be batteryNormal(${operand}); the observed value is ${observedValue}. Please check your APC event log. Possible values: unknown(1), batteryNormal(2), batteryLow(3)"/>
  </service>
  <service name="APC-Battery-Replace-Indicator" interval="300000" user-defined="false" status="on">
      <parameter key="retry" value="5"/>
      <parameter key="timeout" value="5950"/>
      <parameter key="port" value="161"/>
      <parameter key="oid" value=".1.3.6.1.4.1.318.1.1.1.2.2.4.0"/>
      <parameter key="operator" value="="/>
      <parameter key="operand" value="1"/>
      <parameter key="reason-template" value="APC UPS battery needs replacing. The state should be noBatteryNeedsReplacing(${operand}); the observed value is ${observedValue}. Please check your APC event log. Possible values: noBatteryNeedsReplacing(1), batteryNeedsReplacing(2)"/>
  </service>
  <service name="APC-Battery-Bad-Packs-Count" interval="300000" user-defined="false" status="on">
      <parameter key="retry" value="5"/>
      <parameter key="timeout" value="5950"/>
      <parameter key="port" value="161"/>
      <parameter key="oid" value=".1.3.6.1.4.1.318.1.1.1.2.2.6.0"/>
      <parameter key="operator" value="="/>
      <parameter key="operand" value="0"/>
      <parameter key="reason-template" value="APC UPS reports one or more bad battery packs. The number of bad packs should be ${operand}; the observed value is ${observedValue}. Please check your APC event log."/>
  </service>
  <service name="APC-Output-Status" interval="300000" user-defined="false" status="on">
      <parameter key="retry" value="5"/>
      <parameter key="timeout" value="5950"/>
      <parameter key="port" value="161"/>
      <parameter key="oid" value=".1.3.6.1.4.1.318.1.1.1.4.1.1.0"/>
      <parameter key="operator" value="="/>
      <parameter key="operand" value="2"/>
      <parameter key="reason-template" value="APC UPS output status is not online. The state should be onLine(${operand}); the observed value is ${observedValue}. Please check your APC event log. Possible values: unknown(1), onLine(2), onBattery(3), onSmartBoost(4), timedSleeping(5), softwareBypass(6), off(7), rebooting(8), switchedBypass(9), hardwareFailureBypass(10), sleepingUntilPowerReturn(11), onSmartTrim(12)"/>
  </service>
  
  <monitor service="APC-Battery-Status" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor"/>
  <monitor service="APC-Battery-Replace-Indicator" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor"/>
  <monitor service="APC-Battery-Bad-Packs-Count" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor"/>
  <monitor service="APC-Output-Status" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor"/>
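The service elements belong inside a package, and the monitor elements follow the packages at the end of the file. The outline below is a sketch of that placement only. The package name, filter, and address range are assumptions modeled on a typical default package, and the comments mark where the existing content of your file stays unchanged.

```xml
<!-- Sketch only: package name, filter, and range are assumed; keep your own values. -->
<package name="example1">
  <filter>IPADDR != '0.0.0.0'</filter>
  <include-range begin="1.1.1.1" end="254.254.254.254"/>
  <!-- existing rrd element and existing services stay here unchanged -->

  <!-- APC UPS -->
  <service name="APC-Battery-Status" interval="300000" user-defined="false" status="on">
      <parameter key="retry" value="5"/>
      <parameter key="timeout" value="5950"/>
      <parameter key="port" value="161"/>
      <parameter key="oid" value=".1.3.6.1.4.1.318.1.1.1.2.1.1.0"/>
      <parameter key="operator" value="="/>
      <parameter key="operand" value="2"/>
  </service>
  <!-- the other three APC services go here in the same way -->

  <!-- existing downtime elements stay here unchanged -->
</package>

<!-- monitor elements are listed after all packages -->
<monitor service="APC-Battery-Status" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor"/>
```

A service is polled only on interfaces where it has been detected or provisioned, so the service names used here must also exist on the UPS nodes.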

The monitored values are interpreted as follows:

upsBasicBatteryStatus .1.3.6.1.4.1.318.1.1.1.2.1.1.0

  • unknown(1)
  • batteryNormal(2) - UP
  • batteryLow(3)

upsAdvBatteryReplaceIndicator .1.3.6.1.4.1.318.1.1.1.2.2.4.0

  • noBatteryNeedsReplacing(1) - UP
  • batteryNeedsReplacing(2)

upsAdvBatteryNumOfBadBattPacks .1.3.6.1.4.1.318.1.1.1.2.2.6.0

  • number of bad battery packs; 0 means UP

upsBasicOutputStatus .1.3.6.1.4.1.318.1.1.1.4.1.1.0

  • unknown(1)
  • onLine(2) - UP
  • onBattery(3)
  • onSmartBoost(4)
  • timedSleeping(5)
  • softwareBypass(6)
  • off(7)
  • rebooting(8)
  • switchedBypass(9)
  • hardwareFailureBypass(10)
  • sleepingUntilPowerReturn(11)
  • onSmartTrim(12)

If you want to track an ambient temperature probe on your UPS, see How to: APC Ambient Temperature Probe.
