KarafStartupMonitor waiting for loading KarafHealthService: Could not start daemon

Problem:
After updating Ubuntu from 16 to 18, OpenNMS 27 no longer starts.
karafStartupMonitor.log shows that the KarafHealthService could not be loaded. Startup waits 5 minutes for it and then fails.

Expected outcome:
OpenNMS starts normally.

OpenNMS version:
27.2.0, later updated to 28.0.2

OS/Java

  • Ubuntu 18.04.5 LTS
  • openjdk 11.0.11 2021-04-20
  • OpenJDK Runtime Environment (build 11.0.11+9-Ubuntu-0ubuntu2.18.04)
  • OpenJDK 64-Bit Server VM (build 11.0.11+9-Ubuntu-0ubuntu2.18.04, mixed mode, sharing)

Other relevant data:
karafStartupMonitor.log

2021-09-08 16:18:28,896 INFO  [Main] o.o.n.d.AbstractSpringContextJmxServiceDaemon: karafStartupMonitor initialization complete.
2021-09-08 16:18:37,752 INFO  [Main] o.o.n.d.AbstractSpringContextJmxServiceDaemon: karafStartupMonitor starting.
2021-09-08 16:18:37,752 DEBUG [Main] o.o.n.d.AbstractSpringContextJmxServiceDaemon: SPRING: thread.classLoader=java.net.FactoryURLClassLoader@133314b
2021-09-08 16:18:37,752 INFO  [Main] o.o.f.k.h.d.KarafStartupMonitor: KarafStartupMonitor is starting.
2021-09-08 16:18:37,753 INFO  [Main] o.o.f.k.h.d.KarafStartupMonitor: Waiting for loading of org.opennms.features.karaf.health.service.KarafHealthService, will block startup until service is available.
2021-09-08 16:23:37,801 ERROR [Main] o.o.n.d.AbstractSpringContextJmxServiceDaemon: Could not start daemon: java.lang.IllegalStateException: KarafStartupMonitor: It seems Karaf can't be started properly. This is bad, will fail startup.
What can you do about this?
1.) check in logs/karaf.log for problems
2.) clear the 'data' folder - it contains Karaf's cache
3.) run the script bin/fix-karaf-setup.sh

Solution attempts

  1. Checked logs/karaf.log: the file does not exist.
  2. Cleared the ‘data’ folder: no success, OpenNMS still does not start.
  3. Ran the script bin/fix-karaf-setup.sh: no success, OpenNMS still does not start.
  4. Updated OpenNMS from 27 to 28: no success, OpenNMS still does not start.

What else can I do?

Disable the karafStartupMonitor in service-configuration.xml and see what fails when you start the application.

OK, now the error comes from Telemetryd (manager.log):

2021-09-08 17:11:07,576 DEBUG [Main] o.o.n.v.Invoker: Invoking start on object OpenNMS:Name=Telemetryd
2021-09-08 17:15:38,507 ERROR [Main] o.o.n.v.Invoker: An error occurred invoking operation start on MBean OpenNMS:Name=Telemetryd
javax.management.RuntimeMBeanException: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'consumer': Invocation of init method failed; nested exception is java.lang.Exception: No adapter found for class: org.opennms.netmgt.telemetry.protocols.netflow.adapter.netflow5.Netflow5Adapter
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.rethrow(DefaultMBeanServerInterceptor.java:829) ~[?:?]
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.rethrowMaybeMBeanException(DefaultMBeanServerInterceptor.java:842) ~[?:?]
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:811) ~[?:?]
        at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) ~[?:?]
        at org.opennms.netmgt.vmmgr.Invoker.invoke(Invoker.java:277) [org.opennms.core.daemon-28.0.2.jar:?]
        at org.opennms.netmgt.vmmgr.Invoker.invokeMethods(Invoker.java:206) [org.opennms.core.daemon-28.0.2.jar:?]
        at org.opennms.netmgt.vmmgr.Starter.start(Starter.java:157) [org.opennms.core.daemon-28.0.2.jar:?]
        at org.opennms.netmgt.vmmgr.Starter.startDaemon(Starter.java:95) [org.opennms.core.daemon-28.0.2.jar:?]
        at org.opennms.netmgt.vmmgr.Controller.start(Controller.java:173) [org.opennms.core.daemon-28.0.2.jar:?]
        at org.opennms.netmgt.vmmgr.Controller.main(Controller.java:150) [org.opennms.core.daemon-28.0.2.jar:?]
        at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
        at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
        at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
        at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
        at org.opennms.bootstrap.Bootstrap$4.run(Bootstrap.java:531) [opennms_bootstrap.jar:?]
        at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'consumer': Invocation of init method failed; nested exception is java.lang.Exception: No adapter found for class: org.opennms.netmgt.telemetry.protocols.netflow.adapter.netflow5.Netflow5Adapter
        at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:136) ~[org.apache.servicemix.bundles.spring-beans-4.2.9.RELEASE_1.jar:?]
        at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.applyBeanPostProcessorsBeforeInitialization(AbstractAutowireCapableBeanFactory.java:408) ~[org.apache.servicemix.bundles.spring-beans-4.2.9.RELEASE_1.jar:?]
        at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1575) ~[org.apache.servicemix.bundles.spring-beans-4.2.9.RELEASE_1.jar:?]
        at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:399) ~[org.apache.servicemix.bundles.spring-beans-4.2.9.RELEASE_1.jar:?]
        at org.opennms.netmgt.telemetry.daemon.Telemetryd.start(Telemetryd.java:120) ~[org.opennms.features.telemetry.daemon-28.0.2.jar:?]
        at org.opennms.netmgt.daemon.AbstractSpringContextJmxServiceDaemon$2.run(AbstractSpringContextJmxServiceDaemon.java:128) ~[org.opennms.core.daemon-28.0.2.jar:?]
        at org.opennms.core.logging.Logging.withPrefix(Logging.java:71) ~[org.opennms.core.logging-28.0.2.jar:?]
        at org.opennms.netmgt.daemon.AbstractSpringContextJmxServiceDaemon.start(AbstractSpringContextJmxServiceDaemon.java:118) ~[org.opennms.core.daemon-28.0.2.jar:?]
        at jdk.internal.reflect.GeneratedMethodAccessor357.invoke(Unknown Source) ~[?:?]
        at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
        at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
        at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71) ~[?:?]
        at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) ~[?:?]
        at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
        at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
        at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:260) ~[?:?]
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) ~[?:?]
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) ~[?:?]
        at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) ~[?:?]
        at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) ~[?:?]
        at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) ~[?:?]
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:809) ~[?:?]
        ... 13 more
2021-09-08 17:15:38,517 ERROR [Main] o.o.n.v.Starter: An error occurred while attempting to start the "OpenNMS:Name=Telemetryd" service (class org.opennms.netmgt.daemon.SimpleSpringContextJmxServiceDaemon).  Shutting down and exiting.

I disabled the Netflow5Adapter; now the error is that the Netflow9Adapter is not found.
Then I disabled the Netflow9Adapter; now the IpfixAdapter is not found. :unamused:
The IpfixAdapter is now disabled as well. We’ll see how far that goes.

What is the output of ls -l /opt/opennms/lib/org.opennms.features.telemetry.protocols.netflow.adapter*.jar ?

What is the exact OpenNMS version you are running? rpm -q opennms-core or dpkg -l opennms-core ?

-rw-r--r-- 1 root root 17943 Aug 10 23:24 /usr/share/java/opennms/org.opennms.features.telemetry.protocols.netflow.adapter-28.0.2.jar

dpkg-query for opennms:

ii  libopennms-java                              28.0.2-1                    all                         Enterprise-grade Open-source Network Management Platform (OpenNMS Libraries)
ii  libopennmsdeps-java                          28.0.2-1                    all                         Enterprise-grade Open-source Network Management Platform (Required Libraries)
ii  mib2opennms                                  0.3.3-2                     amd64                       Create OpenNMS configuration from MIB files
ii  opennms                                      28.0.2-1                    all                         Enterprise-grade Open-source Network Management Platform (Full Install)
ii  opennms-common                               28.0.2-1                    all                         Enterprise-grade Open-source Network Management Platform (Common Files)
ii  opennms-db                                   28.0.2-1                    all                         Enterprise-grade Open-source Network Management Platform (Database)
ii  opennms-server                               28.0.2-1                    all                         Enterprise-grade Open-source Network Management Platform (Daemon)
ii  opennms-source                               28.0.2-1                    all                         Enterprise-grade Open-source Network Management Platform (Source)
ii  opennms-webapp-jetty                         28.0.2-1 

Wow.
Good idea to check the packages. I compared the list with another system: the package “opennms-plugin-protocol-radius” was missing here (we use RADIUS login).
I installed it and OpenNMS works now.
In the meantime I had deactivated 5 adapters, though. I’ll re-enable them one by one. Let’s see what happens.
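
For anyone hitting the same symptom, the comparison against a working system can be scripted. This is only a minimal sketch, assuming the `dpkg -l` output of a working host and of the broken host has been saved as text. The function names `installed_packages` and `missing_packages` are made up for illustration and are not part of OpenNMS.

```python
from typing import List, Set


def installed_packages(dpkg_output: str) -> Set[str]:
    """Return the names of packages marked installed ('ii') in `dpkg -l` output."""
    names = set()
    for line in dpkg_output.splitlines():
        fields = line.split()
        # Installed rows look like: "ii  <name>  <version>  <arch>  <description...>"
        # Header and separator rows do not start with "ii" and are skipped.
        if len(fields) >= 3 and fields[0] == "ii":
            # Drop a multiarch suffix such as ":amd64" so names compare across hosts.
            names.add(fields[1].split(":")[0])
    return names


def missing_packages(working_output: str, broken_output: str) -> List[str]:
    """Packages installed on the working host but not on the broken one."""
    return sorted(installed_packages(working_output) - installed_packages(broken_output))


if __name__ == "__main__":
    # Hypothetical file names; save `dpkg -l` from each host into them first.
    with open("working-host.txt") as w, open("broken-host.txt") as b:
        for name in missing_packages(w.read(), b.read()):
            print(name)
```

Packages that differ only in version are not reported; the sketch only lists names that are absent on the broken host.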

Crazy. It really was only the missing package “opennms-plugin-protocol-radius”.
This affected Telemetryd, which in turn affected Karaf.
I have now reverted all my changes, the karafStartupMonitor is enabled again, and everything works.

Many thanks for the effort and the steering in the right direction.

PS:
Now I have also found the corresponding error message:

Cannot find class [org.opennms.protocols.radius.springsecurity.RadiusAuthenticationProvider] for bean with name 'externalAuthenticationProvider' 
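
A message like this is easy to miss in a large log. As a rough sketch, the following searches log text for that exact message pattern and returns the class and bean names it mentions. The thread does not say which log file contained the message, so the helper takes the log text as input; the name `find_missing_bean_classes` is illustrative only.

```python
import re
from typing import List, Tuple

# Matches the message format quoted above:
#   Cannot find class [<class>] for bean with name '<bean>'
_MISSING_CLASS = re.compile(
    r"Cannot find class \[([^\]]+)\] for bean with name '([^']+)'"
)


def find_missing_bean_classes(log_text: str) -> List[Tuple[str, str]]:
    """Return distinct (class_name, bean_name) pairs, in order of first appearance."""
    found: List[Tuple[str, str]] = []
    for class_name, bean_name in _MISSING_CLASS.findall(log_text):
        pair = (class_name, bean_name)
        if pair not in found:
            found.append(pair)
    return found
```

A class name under `org.opennms.protocols.radius` in the result points at the RADIUS plugin package, which is what was missing in this case.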

Yep! A broken Spring Security configuration will absolutely cause this condition due to the way we load the Karaf container.

Don’t forget to re-enable the karafStartupMonitor :slight_smile:

Dino