While developing the correlation engine with an algorithm that depends on the timing of events, we ran into a case where events we expected to be correlated were not.
After analyzing the events, we found that the timestamps on the Syslog messages differed from the timestamps on the SNMP traps. Events generated from Syslog messages are tagged with the time parsed from the message itself, whereas events generated from SNMP traps are tagged with the time at which they were received.
In this case, the Syslog timestamps were consistently several minutes off, which is enough to keep a timing-based algorithm from correlating those events with the traps.
To help catch these cases, I’ve created a set of Drools rules that detect this problem and trigger an alarm when it occurs.
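
To make the idea concrete, here is a minimal Python sketch of the kind of check such rules could perform. It is not the Drools rules themselves. It assumes each Syslog event carries both the time parsed from the message and the time it was received. The names `SyslogEvent` and `sources_with_clock_skew`, the one-minute threshold, the three-message minimum, and the use of a median to decide what counts as "consistently" off are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class SyslogEvent:
    source: str              # hypothetical: host that sent the message
    parsed_time: datetime    # timestamp taken from the Syslog message itself
    received_time: datetime  # local clock reading when the message arrived


def skew(event):
    """Offset between parsed and received time (negative: sender is behind)."""
    return event.parsed_time - event.received_time


def sources_with_clock_skew(events, threshold=timedelta(minutes=1), min_samples=3):
    """Return {source: median skew} for sources whose timestamps are consistently off."""
    by_source = {}
    for event in events:
        by_source.setdefault(event.source, []).append(skew(event).total_seconds())

    flagged = {}
    for source, offsets in by_source.items():
        if len(offsets) < min_samples:
            continue  # too few messages to call the offset consistent
        typical = median(offsets)  # a single delayed message should not dominate
        if abs(typical) > threshold.total_seconds():
            flagged[source] = timedelta(seconds=typical)
    return flagged
```

A source that shows up in the result would be a candidate for an alarm. The reported offset also indicates roughly how far its clock has drifted from the receiver's.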