We have problems with a Minion that consumes a lot of memory, and I'm guessing that is why it crashes from time to time. We have increased the heap size, which helped, but it still crashes sometimes.
java version "1.8.0_191"
Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.191-b12, mixed mode)
There are also a lot of "Timeout retrieving 'SnmpCollectors'" messages in the log.
In our log we can also see the following error. I guess it has something to do with the queue?
2021-11-26T14:38:21,717 | ERROR | OpenNMS.Sink.AsyncDispatcher.Trap-Thread-3 | DefaultErrorHandler | 174 - org.apache.camel.camel-core - 2.19.1 | Failed delivery for (MessageId: ID-imonmin01-34923-1633569087686-1-38942734 on ExchangeId: ID-imonmin01-34923-1633569087686-1-38942569). Exhausted after delivery attempt: 1 caught: org.springframework.jms.IllegalStateException: The producer is closed; nested exception is javax.jms.IllegalStateException: The producer is closed
If you are collecting JMX-Minion data on it, there should be graphs for all the ActiveMQ memory metrics and queues.
It is also a good idea to review and tune your ActiveMQ memory values in opennms-activemq.xml; the stock values are extremely conservative.
The systemUsage controls the maximum amount of space the broker will
use before slowing down producers. For more information, see:
If using ActiveMQ embedded - the following limits could safely be used:
<memoryUsage limit="20 mb"/>
<storeUsage limit="1 gb"/>
<tempUsage limit="100 mb"/>
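Before changing anything, it can help to see which limits are currently set. A minimal sketch, assuming the file sits at /opt/opennms/etc/opennms-activemq.xml (that path is an assumption, adjust it to your install); show_activemq_limits is just a hypothetical helper name:

```shell
# Print every memoryUsage / storeUsage / tempUsage limit found in the
# given ActiveMQ config file. Commented-out examples are listed too,
# so compare the output against the active <systemUsage> block.
show_activemq_limits() {
    grep -E '<(memoryUsage|storeUsage|tempUsage) limit=' "$1"
}

# Usage (path is an assumption):
#   show_activemq_limits /opt/opennms/etc/opennms-activemq.xml
```

If the values shown are still the stock ones, that is the first place to look.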
Does garbage collection work at all?
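One way to check is to sample the collector with jstat from the JDK. This is only a sketch: watch_gc and the JSTAT override variable are hypothetical names, and you need the PID of the Minion JVM and to run it as the same user as that process.

```shell
# Sample GC utilisation of a running JVM with the JDK's jstat tool.
#   $1 = PID of the JVM
#   $2 = sampling interval in milliseconds (default 5000)
#   $3 = number of samples (default 12)
# JSTAT can be set to point at a specific jstat binary.
watch_gc() {
    pid=$1
    interval_ms=${2:-5000}
    samples=${3:-12}
    "${JSTAT:-jstat}" -gcutil "$pid" "$interval_ms" "$samples"
}

# Usage: watch_gc <pid-of-minion-jvm>
```

In the output, O is the old generation occupancy in percent and FGC is the number of full collections. If FGC keeps rising while O stays close to 100, the collector is running but cannot free anything, which points to objects still being referenced rather than to GC not working.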
I am working together with sojjan. The memory usage for the minion in question is constantly increasing until we restart the minion. The maximum heap size is set to 16G. Does anybody have an idea of what can be done to make the minion free up memory without restarting it?
What is consuming all the memory?
Take a heap dump and analyze it with something like https://heaphero.io/
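In case it helps, a minimal sketch of taking the dump with jmap from the JDK. take_heap_dump and the JMAP override variable are hypothetical names; the PID and output path are yours to fill in, and the command has to run as the same user as the Minion JVM.

```shell
# Write a binary heap dump of the live objects of a running JVM.
#   $1 = PID of the JVM
#   $2 = output file (.hprof)
# JMAP can be set to point at a specific jmap binary.
take_heap_dump() {
    pid=$1
    out=$2
    "${JMAP:-jmap}" "-dump:live,format=b,file=$out" "$pid"
}

# Usage: take_heap_dump <pid-of-minion-jvm> /tmp/minion.hprof
```

Keep in mind that the dump can be roughly as large as the used heap, so make sure the target file system has enough free space, and that the JVM is paused while the dump is written.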
Took a heap dump as you suggested and analyzed it with Eclipse Memory Analyzer. After running a “leak suspect” report it is telling me the following:
One instance of "org.opennms.core.ipc.sink.common.AsyncDispatcherImpl" loaded by "org.apache.felix.framework.BundleWiringImpl$BundleClassLoader @ 0x3c1b43ac0" occupies 1 894 466 896 (96,71%) bytes. The memory is accumulated in one instance of "java.util.concurrent.ConcurrentHashMap$Node" loaded by "<system class loader>".
org.apache.felix.framework.BundleWiringImpl$BundleClassLoader @ 0x3c1b43ac0
As someone who is not an ActiveMQ expert, it looks to me like ActiveMQ is having difficulty sending messages to the core, so messages are backing up (in memory) on the topic until it runs out of memory.
How’s the link between the two?
Thanks for the reply. I had a look with ping and also tried a nc -v 61616 and everything seems fine. No delay in responding, however I do see TimeoutExceptions or CompletionExceptions in the minion.log file:
grep -c java.util.concurrent.CompletionException /var/log/minion/minion.log
That is just since the last day. I don't know whether it has any relevance here, though.
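To see whether those exceptions come in bursts, the count can be broken down per hour. A rough sketch, assuming log entries start with a timestamp like 2021-11-26T14:38:21,717 as in the error quoted earlier; count_per_hour is a hypothetical helper, and lines without a timestamp (stack trace lines) are attributed to the most recent timestamped entry:

```shell
# Count lines containing a fixed string, grouped by hour.
#   $1 = string to search for (matched literally, not as a regex)
#   $2 = log file
count_per_hour() {
    pattern=$1
    logfile=$2
    awk -v pat="$pattern" '
        # Remember the date and hour of the latest timestamped line.
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T[0-9][0-9]:/ { hour = substr($0, 1, 13) }
        # Count matching lines under that hour.
        index($0, pat) > 0 && hour != "" { count[hour]++ }
        END { for (h in count) print h, count[h] }
    ' "$logfile" | sort
}

# Usage:
#   count_per_hour java.util.concurrent.CompletionException /var/log/minion/minion.log
```

If the counts line up with the times when memory usage climbs, that would suggest the two are related.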
It works better now after upgrading Java to OpenJDK 11.0.14 (2022-01-18).