Minion fails delivering messages using ActiveMQ

Hi!

We have problems with a Minion that keeps consuming memory, and I'm guessing that is why it crashes from time to time. We have increased the heap size; it works better, but it still crashes sometimes.
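
For context, the heap was raised via the standard Karaf memory variables; the exact file depends on the packaging (the Minion's setenv or sysconfig file), so treat this as a sketch rather than our exact config:

    # Sketch: standard Karaf JVM heap variables picked up by the Minion's
    # start script. The file location (bin/setenv, /etc/sysconfig/minion, ...)
    # and exact variable names depend on how the Minion was installed.
    export JAVA_MIN_MEM=2048M
    export JAVA_MAX_MEM=16384M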

java version "1.8.0_191"
Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.191-b12, mixed mode)

minion: 25.2.1

We also see a lot of "Timeout retrieving 'SnmpCollectors'" messages in the log.

I guess it's something to do with the queue? In our log we can see the following:

2021-11-26T14:38:21,717 | ERROR | OpenNMS.Sink.AsyncDispatcher.Trap-Thread-3 | DefaultErrorHandler | 174 - org.apache.camel.camel-core - 2.19.1 | Failed delivery for (MessageId: ID-imonmin01-34923-1633569087686-1-38942734 on ExchangeId: ID-imonmin01-34923-1633569087686-1-38942569). Exhausted after delivery attempt: 1 caught: org.springframework.jms.IllegalStateException: The producer is closed; nested exception is javax.jms.IllegalStateException: The producer is closed

If you are collecting JMX-Minion on it, there should be graphs for all the ActiveMQ memory metrics and queues.

It is also a good idea to review and tune your ActiveMQ memory values in opennms-activemq.xml; the stock values are extremely conservative:

        <!--
            The systemUsage controls the maximum amount of space the broker will
            use before slowing down producers. For more information, see:
            http://activemq.apache.org/producer-flow-control.html
            If using ActiveMQ embedded - the following limits could safely be used:
        -->
        <systemUsage>
            <systemUsage>
                <memoryUsage>
                    <memoryUsage limit="20 mb"/>
                </memoryUsage>
                <storeUsage>
                    <storeUsage limit="1 gb"/>
                </storeUsage>
                <tempUsage>
                    <tempUsage limit="100 mb"/>
                </tempUsage>
            </systemUsage>
        </systemUsage>

Current settings:

        <systemUsage>
            <systemUsage>
                <memoryUsage>
                    <memoryUsage limit="1024 mb"/>
                </memoryUsage>
                <storeUsage>
                    <storeUsage limit="2 gb"/>
                </storeUsage>
                <tempUsage>
                    <tempUsage limit="500 mb"/>
                </tempUsage>
            </systemUsage>
        </systemUsage>

Increase a little bit? :wink:

Does the garbage collection work at all?
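
A quick way to check is to watch GC activity with jstat from the JDK; if old-gen occupancy keeps climbing across full GCs, it points at a leak rather than a GC problem:

    # Print heap occupancy and GC counters for the Minion JVM every 5 seconds.
    # Replace <minion-pid> with the JVM's process id (e.g. from `jps -l`).
    jstat -gcutil <minion-pid> 5000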

I am working together with sojjan. The memory usage for the minion in question is constantly increasing until we restart the minion. The maximum heap size is set to 16G. Does anybody have an idea of what can be done to make the minion free up memory without restarting it?

What is consuming all the memory?

Take a heap dump and analyze it with something like https://heaphero.io/
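
The dump itself can be taken with the stock JDK tools, for example:

    # Dump live objects from the Minion JVM to a file for offline analysis.
    # The dump can be roughly as large as the used heap, so make sure the
    # target filesystem has enough free space.
    jmap -dump:live,format=b,file=/tmp/minion-heap.hprof <minion-pid>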

I took a heap dump as you suggested and analyzed it with Eclipse Memory Analyzer. After running a "Leak Suspects" report, it tells me the following:

One instance of "org.opennms.core.ipc.sink.common.AsyncDispatcherImpl" loaded by "org.apache.felix.framework.BundleWiringImpl$BundleClassLoader @ 0x3c1b43ac0" occupies 1 894 466 896 (96,71%) bytes. The memory is accumulated in one instance of "java.util.concurrent.ConcurrentHashMap$Node[]" loaded by "<system class loader>".

Keywords
java.util.concurrent.ConcurrentHashMap$Node[]
org.opennms.core.ipc.sink.common.AsyncDispatcherImpl
org.apache.felix.framework.BundleWiringImpl$BundleClassLoader @ 0x3c1b43ac0
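
For what it's worth, the same "Leak Suspects" report can also be generated headless with MAT's command-line wrapper (handy on a server without a GUI); this assumes a stock MAT install:

    # Parse the heap dump and produce the Leak Suspects report without the GUI.
    # ParseHeapDump.sh ships in the Eclipse MAT installation directory.
    ./ParseHeapDump.sh /tmp/minion-heap.hprof org.eclipse.mat.api:suspects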

As someone who is not an ActiveMQ expert, it looks to me like ActiveMQ is having difficulty sending messages to the core, and so messages are backing up (in memory) on the topic until it runs out of memory.

How’s the link between the two?

Thanks for the reply. I had a look with ping and also tried nc -v against port 61616, and everything seems fine. There is no delay in responding; however, I do see TimeoutExceptions and CompletionExceptions in the minion.log file:

grep -c java.util.concurrent.CompletionException /var/log/minion/minion.log
2743
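
If it helps to see whether those exceptions cluster around the times the Minion struggles, they can be bucketed per hour. This assumes the timestamps look like the log line above (2021-11-26T14:38:21,717):

    # Count matching lines per hour by keeping the first 13 characters of each
    # line (YYYY-MM-DDTHH). Stack-trace continuation lines without a leading
    # timestamp will end up in their own odd-looking buckets.
    grep java.util.concurrent.CompletionException /var/log/minion/minion.log \
        | cut -c1-13 | sort | uniq -c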

That is just from the last day. I don't know if that has any relevance here, though.