Monitoring Apache with the HTTP collector

Here’s a working example that can collect data on idle and busy workers from Apache web servers running mod_status.

Apache configuration

In your Apache config file, find the line that matches the one below and uncomment it.

ExtendedStatus On

As above, find the entry that looks like this, remove the Deny line and change the Allow line to say all. This lets any source IP call the status page. You will obviously want to lock this down later, but it is a good place to start to get up and running.

 <Location /server-status>
     SetHandler server-status
     Order deny,allow
     Allow from all
 </Location>

You will need to restart the Apache web server for the changes to take effect.

mod_status output

Apache’s mod_status produces a “machine readable” output something like this:


Total Accesses: 2750703
Total kBytes: 17770979
CPULoad: .750477
Uptime: 34582
ReqPerSec: 79.5415
BytesPerSec: 526213
BytesPerReq: 6615.58
BusyWorkers: 16
IdleWorkers: 59
Scoreboard: ___R_____W_____W______W__.......................................__K__W_KW_R_________WK___

Some of these fields look useful. Total Accesses and Total kBytes are counters that are incremented throughout the lifetime of the server process. We could collect those and use them to provide request/s and kByte/s throughput data, but the counters are reset to zero every time the server is restarted. If we store these in RRD data sources of type COUNTER, there will be large spikes in the data as rrdtool or jrobin tries to deal with the fact that the counter has reset to zero. This is probably too much work to deal with. Fortunately, the mod_status output does this calculation for us anyway with the ReqPerSec and BytesPerSec fields. By collecting these as type GAUGE, we can avoid the counter reset issue. I’m going to develop a collection for BytesPerSec, BusyWorkers and IdleWorkers. We could also reasonably collect and graph ReqPerSec and BytesPerReq, which are both interesting metrics.
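
The spike caused by a counter reset can be reproduced with a few lines of Python. This is a minimal sketch, not rrdtool's actual code: the counter_rate helper and every sample value after the first are made up for illustration, and it assumes a 32-bit counter, so that any decrease is read as a wrap-around rather than a restart.

```python
COUNTER_32_MAX = 2 ** 32


def counter_rate(prev, curr, step_seconds):
    """Per-second rate between two counter samples.

    A decrease is treated as a 32-bit wrap-around (an assumption of this
    sketch), which is what turns a restart into a huge bogus rate.
    """
    delta = curr - prev
    if delta < 0:
        delta += COUNTER_32_MAX
    return delta / step_seconds


# Hypothetical "Total Accesses" samples taken 300 s apart. Apache is
# restarted before the third sample, so the counter starts again near zero.
samples = [2750703, 2774565, 1200, 25062]
rates = [counter_rate(a, b, 300) for a, b in zip(samples, samples[1:])]

for rate in rates:
    print(f"{rate:.2f} requests/s")
```

The first and last intervals come out at roughly 80 requests per second, while the interval that spans the restart comes out in the millions. Collecting the precomputed ReqPerSec and BytesPerSec values as gauges sidesteps this entirely.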

We can use the HTTP collector’s regular expression capabilities to collect these metrics and tuck them away in RRDs for graphing and thresholding.

Configuring service detection

The first step is to define a service that we can use to identify nodes that support mod_status.

I assumed that any machine that responds with an HTTP 200 response code when asked for /server-status/ on port 80 is running mod_status.
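
That assumption can be checked by hand before touching any OpenNMS configuration. The sketch below is plain Python and not part of OpenNMS; the has_mod_status helper, its defaults and the decision to bypass proxies are all hypothetical choices made for this example.

```python
import http.client
import urllib.request

# Opener that ignores any proxy settings, so the check talks to the host directly.
_DIRECT = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def has_mod_status(host, port=80, timeout=5.0):
    """Return True if GET /server-status/ on host:port answers with HTTP 200."""
    url = f"http://{host}:{port}/server-status/"
    try:
        with _DIRECT.open(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # URLError and HTTPError (404, 403, ...) are OSError subclasses, as
        # are refused connections and socket timeouts.
        return False
```

Calling has_mod_status("www.example.org") for each candidate web server (the host name is a placeholder) gives a quick list of nodes that the detector should pick up.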

Your OpenNMS install will use provisiond to assign the service to an interface for monitoring. This can be done by directly assigning the service in the requisition or by configuring a detector in the foreign source definition. For the service to become available for assignment, it must be configured in pollerd.

provisiond configuration

Foreign source definition





NOTE: The requisition editor doesn’t allow for parameters. A directly assigned Apache-Stats service would therefore be checked on port 80, based on the parameters in the poller configuration, so adding the service as a detector is probably the better approach.

pollerd configuration


        <service name="Apache-Stats" interval="300000" user-defined="false" status="on">
            <parameter key="port" value="80" />
            <parameter key="timeout" value="5000" />
            <parameter key="retry" value="3" />
            <parameter key="collection" value="Apache-Stats" />
        </service>

At the end of the file, don’t forget to add:

    <monitor service="Apache-Stats" class-name="org.opennms.netmgt.poller.monitors.HttpMonitor"/>

This defines a new service, “Apache-Stats”. The name is not critical, but it does need to be consistent. The case-sensitive name is used to match against service detection and data collection.

At this point, I restarted OpenNMS and rescanned a node that I knew offered the service to ensure that the service could be discovered. Sure enough the service showed up as “Not Monitored” in the appropriate node view.

Data collection

It is important to be aware that in OPENNMS_1.3.2_RELEASE, HTTP collection operates at node level. There can therefore only be one instance of an individual HTTP collection per node. As the HTTP collector notes state, if more than one IP address on a node is found to have the same collectable HTTP service defined, only one address will be scheduled for collection.

There are two files to be configured here, collectd-configuration.xml and the new http-datacollection-config.xml file.


I defined this service in the default “example1” package, but you could put it in any appropriate package that included the nodes that you wish to collect on:

    <service name="Apache-Stats" interval="300000" user-defined="false" status="on" >
      <parameter key="http-collection" value="Apache-Stats" />
      <parameter key="retry" value="1" />
      <parameter key="timeout" value="2000" />
    </service>

Note that the service name must match the service name in capsd-configuration.xml. The http-collection parameter value is used later on in http-datacollection-config.xml.

Further down the file, outside of the package definitions, I added a service to class mapping for the service:

        <collector service="Apache-Stats" class-name="org.opennms.netmgt.collectd.HttpCollector" />


I’m including the whole http-datacollection-config.xml file here. The important thing to note is that the http-collection name must match the http-collection value in collectd-configuration.xml or poller-configuration.xml (in this case Apache-Stats). I also removed the existing “doc-count” collection from http-datacollection-config.xml as there was no corresponding doc-count collection in my collectd-configuration.xml.

<?xml version="1.0" encoding="UTF-8"?>
<http-datacollection-config
    rrdRepository="/opt/OpenNMS/share/rrd/snmp/" >
  <http-collection name="Apache-Stats">
    <rrd step="300">
      <!-- <rra> definitions go here, e.g. those from the stock doc-count collection -->
    </rrd>
    <uris>
      <uri name="apache">
        <url path="/server-status/" query="auto"
             user-agent="Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/412 (KHTML, like Gecko) Safari/412"
             matches="(?s).*BytesPerSec:\s([0-9]+).*BusyWorkers:\s([0-9]+).*IdleWorkers:\s([0-9]+).*" response-range="100-399" >
        </url>
        <attributes>
          <attrib alias="BytesPerSec" match-group="1" type="gauge32"/>
          <attrib alias="BusyWorkers" match-group="2" type="gauge32"/>
          <attrib alias="IdleWorkers" match-group="3" type="gauge32"/>
        </attributes>
      </uri>
    </uris>
  </http-collection>
</http-datacollection-config>

The donkey work here is done by the uri element.

  • path="/server-status/" query="auto" tells the collector which URL to request (note that query="auto" just adds a “?auto” query parameter, making the mod_status output machine readable).
  • matches="(?s).*BytesPerSec:\s([0-9]+).*BusyWorkers:\s([0-9]+).*IdleWorkers:\s([0-9]+).*" uses capturing groups to pick out the numbers following the words “BytesPerSec”, “BusyWorkers” and “IdleWorkers”.
  • The attrib elements store the values captured by groups 1, 2 and 3 into the “BytesPerSec”, “BusyWorkers” and “IdleWorkers” RRDs respectively, with type gauge32.

Note that the (?s) at the beginning of the regular expression is a “Mode Modifier”. It sets the regular expression pattern to “Dot-matches-all” mode, which allows a dot to match a new line as well as any other character. This is required because, as you can see from the example at the top of the page, the machine-readable output is spread across several lines.

After this, I restarted OpenNMS again to see that collection was taking place (and RRDs appearing).
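
The effect of the (?s) modifier is easy to try outside OpenNMS. The following is a small Python sketch rather than the collector's own code: it assumes the pattern has to match the whole response (hence re.fullmatch), and it relies on Python's re module understanding the same (?s) inline flag. The sample text is the mod_status output from earlier on this page, with the Scoreboard line shortened.

```python
import re

# Machine-readable mod_status output (Scoreboard line shortened).
SAMPLE = """Total Accesses: 2750703
Total kBytes: 17770979
CPULoad: .750477
Uptime: 34582
ReqPerSec: 79.5415
BytesPerSec: 526213
BytesPerReq: 6615.58
BusyWorkers: 16
IdleWorkers: 59
Scoreboard: ___R_____W_____W______W__....
"""

# The pattern from the url element, without the leading mode modifier.
BODY = r".*BytesPerSec:\s([0-9]+).*BusyWorkers:\s([0-9]+).*IdleWorkers:\s([0-9]+).*"

with_dotall = re.fullmatch("(?s)" + BODY, SAMPLE)
without_dotall = re.fullmatch(BODY, SAMPLE)

print(with_dotall.groups())   # the three captured numbers, as strings
print(without_dotall)         # None: a plain dot cannot cross line breaks
```

With the modifier, groups 1 to 3 hold the BytesPerSec, BusyWorkers and IdleWorkers values. Without it there is no match at all, so nothing would be collected.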

Drawing the Graphs

HTTP collector output is at node level, and the graphs need to be configured accordingly. At this point (OpenNMS_1.3.2_RELEASE) the collector can only collect a single instance of an HTTP collection per node. The RRDs are therefore stored at node level within the rrd directory ($OPENNMS_HOME/share/rrd/snmp/<node_id>). This means that the graphs will be shown in the SNMP Node Data -> Node level performance data section of the node’s resource graphs page. It also means that the report type needs to be defined as nodeSnmp in the report definition (see below).

Add the (to be defined) report to the list of reports at the top of the file:


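Add the report definition to the bottom of the file:

report.apache.workers.name=HTTP Workers
report.apache.workers.columns=BusyWorkers,IdleWorkers
report.apache.workers.type=nodeSnmp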
report.apache.workers.command=--title="Apache HTTP workers" \
    --vertical-label workers \
    DEF:BusyWorkers={rrd1}:BusyWorkers:AVERAGE \
    DEF:IdleWorkers={rrd2}:IdleWorkers:AVERAGE \
    LINE2:BusyWorkers#ff0000:"busy workers " \
    GPRINT:BusyWorkers:AVERAGE:"Avg  \\: %8.2lf %s" \
    GPRINT:BusyWorkers:MIN:"Min  \\: %8.2lf %s" \
    GPRINT:BusyWorkers:MAX:"Max  \\: %8.2lf %s\\n" \
    LINE2:IdleWorkers#00ff00:"idle workers " \
    GPRINT:IdleWorkers:AVERAGE:"Avg  \\: %8.2lf %s" \
    GPRINT:IdleWorkers:MIN:"Min  \\: %8.2lf %s" \
    GPRINT:IdleWorkers:MAX:"Max  \\: %8.2lf %s\\n"

Note that drawing the BytesPerSec graph is left as an exercise for the reader.

Wait a while and then enjoy your new graphs.


http-datacollection-config.xml Full

I modified your original collectd and capsd configuration to stop false collections from servers that respond to a web server discovery but don’t have server-status. In addition to the above changes, the code below collects and graphs all the data from the status page, for those looking for a cut-and-paste solution.

 <?xml version="1.0" encoding="UTF-8"?>
 <http-datacollection-config
    rrdRepository="/opt/opennms/share/rrd/snmp/" >
 <http-collection name="Apache-Stats">
   <rrd step="300">
     <!-- <rra> definitions go here, e.g. those from the stock doc-count collection -->
   </rrd>
   <uris>
     <uri name="apache">
       <url path="/server-status/" query="auto"
            user-agent="Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/412 (KHTML, like Gecko) Safari/412"
            matches="(?s).*?Total\sAccesses:\s([0-9]+).*?Total\skBytes:\s([0-9]+).*?CPULoad:\s([0-9\.]+).*?Uptime:\s([0-9]+).*?ReqPerSec:\s([0-9\.]+).*?BytesPerSec:\s([0-9\.]+).*?BytesPerReq:\s([0-9\.]+).*?BusyWorkers:\s([0-9]+).*?IdleWorkers:\s([0-9]+).*" response-range="100-399" >
       </url>
       <attributes>
         <attrib alias="TotalAccesses" match-group="1" type="counter32"/>
         <attrib alias="TotalkBytes" match-group="2" type="counter32"/>
         <attrib alias="CPULoad" match-group="3" type="gauge32"/>
         <attrib alias="Uptime" match-group="4" type="gauge32"/>
         <attrib alias="ReqPerSec" match-group="5" type="gauge32"/>
         <attrib alias="BytesPerSec" match-group="6" type="gauge32"/>
         <attrib alias="BytesPerReq" match-group="7" type="gauge32"/>
         <attrib alias="BusyWorkers" match-group="8" type="gauge32"/>
         <attrib alias="IdleWorkers" match-group="9" type="gauge32"/>
       </attributes>
     </uri>
   </uris>
 </http-collection>
 </http-datacollection-config>
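
With nine groups it is easy to get a match-group number out of step with its alias, so it can be worth checking the pairing against real output before restarting OpenNMS. The sketch below does that in Python; it is only an illustration, assumes whole-response matching (re.fullmatch), and reads every captured value as a float purely for display, which says nothing about how the collector itself stores them.

```python
import re

# Machine-readable mod_status output (Scoreboard line shortened).
SAMPLE = """Total Accesses: 2750703
Total kBytes: 17770979
CPULoad: .750477
Uptime: 34582
ReqPerSec: 79.5415
BytesPerSec: 526213
BytesPerReq: 6615.58
BusyWorkers: 16
IdleWorkers: 59
Scoreboard: ___R_____W_____W______W__....
"""

# The matches attribute from the full configuration, split for readability.
PATTERN = (
    r"(?s).*?Total\sAccesses:\s([0-9]+).*?Total\skBytes:\s([0-9]+)"
    r".*?CPULoad:\s([0-9\.]+).*?Uptime:\s([0-9]+)"
    r".*?ReqPerSec:\s([0-9\.]+).*?BytesPerSec:\s([0-9\.]+)"
    r".*?BytesPerReq:\s([0-9\.]+).*?BusyWorkers:\s([0-9]+)"
    r".*?IdleWorkers:\s([0-9]+).*"
)

# Aliases in match-group order, as listed in the attrib elements.
ALIASES = [
    "TotalAccesses", "TotalkBytes", "CPULoad", "Uptime", "ReqPerSec",
    "BytesPerSec", "BytesPerReq", "BusyWorkers", "IdleWorkers",
]

match = re.fullmatch(PATTERN, SAMPLE)
values = {
    alias: float(match.group(index))
    for index, alias in enumerate(ALIASES, start=1)
}

for alias, value in values.items():
    print(f"{alias:>14}: {value}")
```

If a field is missing from the status page, for example because ExtendedStatus is off, the match fails as a whole and match is None, which mirrors the all-or-nothing behaviour of a single regular expression.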

Drawing the Graphs Full


##  Please add report definition in a new line to make it easier
##  for script based sanity checks

reports=apache.workers, \
apache.bytes, \
apache.uptime, \
apache.cpu, \
apache.access, \
apache.kbytes, \
apache.byteperreq, \
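apache.reqpersec

report.apache.workers.name=HTTP Workers
report.apache.workers.columns=BusyWorkers,IdleWorkers
report.apache.workers.type=nodeSnmp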
report.apache.workers.command=--title="Apache HTTP workers" \
 --vertical-label workers \
 DEF:BusyWorkers={rrd1}:BusyWorkers:AVERAGE \
 DEF:IdleWorkers={rrd2}:IdleWorkers:AVERAGE \
 COMMENT:"      " \
 LINE2:BusyWorkers#ff0000:"busy workers " \
 GPRINT:BusyWorkers:AVERAGE:"Avg\\: %7.2lf %s" \
 GPRINT:BusyWorkers:MIN:"Min\\: %7.2lf %s" \
 GPRINT:BusyWorkers:MAX:"Max\\: %7.2lf %s\\n" \
 COMMENT:"      " \
 LINE2:IdleWorkers#00ff00:"idle workers " \
 GPRINT:IdleWorkers:AVERAGE:"Avg\\: %7.2lf %s" \
 GPRINT:IdleWorkers:MIN:"Min\\: %7.2lf %s" \
 GPRINT:IdleWorkers:MAX:"Max\\: %7.2lf %s\\n"

report.apache.bytes.name=Bytes Per Second
report.apache.bytes.columns=BytesPerSec
report.apache.bytes.type=nodeSnmp
report.apache.bytes.command=--title="Apache HTTP Bytes Per Second" \
 --vertical-label Bytes \
 DEF:BytesPerSec={rrd1}:BytesPerSec:AVERAGE \
 AREA:BytesPerSec#66CCFF: \
 COMMENT:"      " \
 LINE1:BytesPerSec#000000:"Bytes per second " \
 GPRINT:BytesPerSec:AVERAGE:"Avg\\: %7.2lf %s" \
 GPRINT:BytesPerSec:MIN:"Min\\: %7.2lf %s" \
 GPRINT:BytesPerSec:MAX:"Max\\: %7.2lf %s\\n"

report.apache.uptime.name=Uptime
report.apache.uptime.columns=Uptime
report.apache.uptime.type=nodeSnmp
report.apache.uptime.command=--title="Apache HTTP Uptime" \
 --vertical-label UpTime \
 --units-exponent 0 \
 DEF:Uptime={rrd1}:Uptime:AVERAGE \
 CDEF:timesec=Uptime,1,* \
 CDEF:timemin=timesec,60,/ \
 CDEF:timehour=timemin,60,/ \
 CDEF:timeday=timehour,24,/ \
 AREA:timehour#CC99FF: \
 COMMENT:"      " \
 LINE1:timehour#000000:"Hours" \
 GPRINT:timehour:MIN:"Min\\: %7.2lf" \
 GPRINT:timehour:MAX:"Max\\: %7.2lf\\n" \
 AREA:timeday#33FF00: \
 COMMENT:"      " \
 LINE1:timeday#33FF00:"Days" \
 GPRINT:timeday:MIN:"Min\\: %7.2lf" \
 GPRINT:timeday:MAX:"Max\\: %7.2lf\\n"

report.apache.cpu.name=Cpu Load
report.apache.cpu.columns=CPULoad
report.apache.cpu.type=nodeSnmp
report.apache.cpu.command=--title="Apache Cpu Load" \
 --vertical-label Load \
 DEF:CPULoad={rrd1}:CPULoad:AVERAGE \
 AREA:CPULoad#999999: \
 COMMENT:"      " \
 LINE1:CPULoad#000000:"Load" \
 GPRINT:CPULoad:AVERAGE:"Avg\\: %7.2lf%%" \
 GPRINT:CPULoad:MIN:"Min\\: %7.2lf%%" \
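 GPRINT:CPULoad:MAX:"Max\\: %7.2lf%%\\n"

report.apache.access.name=Accesses
report.apache.access.columns=TotalAccesses
report.apache.access.type=nodeSnmp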
report.apache.access.command=--title="Apache Total Accesses" \
 --vertical-label Number \
 DEF:TotalAccesses={rrd1}:TotalAccesses:AVERAGE \
 AREA:TotalAccesses#FF6600: \
 COMMENT:"      " \
 LINE1:TotalAccesses#000000:"Total Accesses" \
 GPRINT:TotalAccesses:AVERAGE:"Avg  \\: %7.2lf %s" \
 GPRINT:TotalAccesses:MIN:"Min\\: %7.2lf %s" \
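 GPRINT:TotalAccesses:MAX:"Max\\: %7.2lf %s\\n"

report.apache.kbytes.name=Total kBytes
report.apache.kbytes.columns=TotalkBytes
report.apache.kbytes.type=nodeSnmp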
report.apache.kbytes.command=--title="Apache Total kBytes" \
 --vertical-label kBytes \
 DEF:TotalkBytes={rrd1}:TotalkBytes:AVERAGE \
 AREA:TotalkBytes#00cc00: \
 COMMENT:"      " \
 LINE1:TotalkBytes#000000:"Total kBytes" \
 GPRINT:TotalkBytes:AVERAGE:"Avg\\: %7.2lf %s" \
 GPRINT:TotalkBytes:MIN:"Min\\: %7.2lf %s" \
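 GPRINT:TotalkBytes:MAX:"Max\\: %7.2lf %s\\n"

report.apache.byteperreq.name=Bytes Per Request
report.apache.byteperreq.columns=BytesPerReq
report.apache.byteperreq.type=nodeSnmp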
report.apache.byteperreq.command=--title="Apache Bytes Per Request" \
 --vertical-label Bytes \
 DEF:BytesPerReq={rrd1}:BytesPerReq:AVERAGE \
 AREA:BytesPerReq#9999CC: \
 COMMENT:"      " \
 LINE1:BytesPerReq#000000:"Bytes Per Request" \
 GPRINT:BytesPerReq:AVERAGE:"Avg\\: %7.2lf %s" \
 GPRINT:BytesPerReq:MIN:"Min\\: %7.2lf %s" \
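 GPRINT:BytesPerReq:MAX:"Max\\: %7.2lf %s\\n"

report.apache.reqpersec.name=Requests Per Second
report.apache.reqpersec.columns=ReqPerSec
report.apache.reqpersec.type=nodeSnmp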
report.apache.reqpersec.command=--title="Apache Requests Per Second" \
 --vertical-label Requests \
 DEF:ReqPerSec={rrd1}:ReqPerSec:AVERAGE \
 AREA:ReqPerSec#009999: \
 COMMENT:"      " \
 LINE1:ReqPerSec#000000:"Requests Per Second" \
 GPRINT:ReqPerSec:AVERAGE:"Avg\\: %7.2lf %s" \
 GPRINT:ReqPerSec:MIN:"Min\\: %7.2lf %s" \
 GPRINT:ReqPerSec:MAX:"Max\\: %7.2lf %s\\n"
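
The CDEF lines in the uptime report use rrdtool's reverse Polish notation to turn the Uptime value (seconds) into hours and days. For anyone unfamiliar with RPN, here is a small Python sketch of how that chain evaluates. The eval_rpn helper is hypothetical and only understands the four arithmetic operators, which is all the uptime report needs; the starting value is the Uptime figure from the sample mod_status output.

```python
import operator

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(expression, variables):
    """Evaluate a comma-separated RPN expression such as 'timesec,60,/'."""
    stack = []
    for token in expression.split(","):
        if token in OPERATORS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        elif token in variables:
            stack.append(variables[token])
        else:
            stack.append(float(token))
    return stack.pop()


# The CDEF chain from the uptime report, in definition order.
CDEFS = [
    ("timesec", "Uptime,1,*"),
    ("timemin", "timesec,60,/"),
    ("timehour", "timemin,60,/"),
    ("timeday", "timehour,24,/"),
]

values = {"Uptime": 34582.0}
for name, expression in CDEFS:
    values[name] = eval_rpn(expression, values)

print(f"hours: {values['timehour']:.2f}")
print(f"days:  {values['timeday']:.2f}")
```

Each CDEF can refer to the names defined before it, which is why the order of the lines in the report matters.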