Troubleshoot Telemetryd

A checklist of things to verify when telemetryd does not work as expected. "Not working" can mean many things, so feel free to add further checks.

Telemetryd Availability

Make sure all components can reach each other.
Is a Minion or OpenNMS listening on the correct port? Can routers reach them on this port?

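You can use nmap to test the port from the outside. For UDP, nmap reports open|filtered when it receives no reply, so this result alone does not prove that the listener is reachable: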

[14:25]root@some_server:~# nmap {MINION|OPENNMS} -p 9999 -sU

Starting Nmap 7.60 ( https://nmap.org ) at 2021-01-30 14:25 UTC
Nmap scan report for {MINION|OPENNMS}
Host is up.

PORT     STATE         SERVICE
9999/udp open|filtered distinct

Nmap done: 1 IP address (1 host up) scanned in 2.16 seconds

On your OpenNMS or Minion server you can use netstat to check if the application is listening on the correct port:

[14:21]root@minion-vm:~# netstat -tulpn | grep 9999
udp6       0      0 :::9999                 :::*                                11494/docker-proxy 
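
The same check can be scripted. This is a minimal sketch, not something shipped with OpenNMS: has_udp_listener is a hypothetical helper name, and it assumes the netstat column layout shown above (protocol in column 1, local address in column 4).

```shell
# Hypothetical helper: reads netstat-style lines on stdin and succeeds
# if a UDP socket is bound to the given port.
has_udp_listener() {
  awk -v port="$1" '
    $1 ~ /^udp/ && $4 ~ (":" port "$") { found = 1 }
    END { exit !found }
  '
}
```

For example: netstat -uln | has_udp_listener 9999 && echo "listening on 9999"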

Are flow packets incoming?

You can use tcpdump on the OpenNMS/Minion server to check whether the routers are sending data. Filter for UDP traffic on the correct interface and port.

[14:20]root@{MINION|OPENNMS}:~# tcpdump -i ens18 port 9999 and udp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens18, link-type EN10MB (Ethernet), capture size 262144 bytes
14:21:01.716729 IP flow_router.2040 > {MINION|OPENNMS}.9999: UDP, length 1316
14:21:01.746687 IP flow_router.2040 > {MINION|OPENNMS}.9999: UDP, length 1300
14:21:01.746840 IP flow_router.2040 > {MINION|OPENNMS}.9999: UDP, length 1316
14:21:01.746859 IP flow_router.2040 > {MINION|OPENNMS}.9999: UDP, length 1316
14:21:01.746868 IP flow_router.2040 > {MINION|OPENNMS}.9999: UDP, length 1316
14:21:01.746877 IP flow_router.2040 > {MINION|OPENNMS}.9999: UDP, length 1316
14:21:01.756641 IP flow_router.2040 > {MINION|OPENNMS}.9999: UDP, length 1140
14:21:01.756843 IP flow_router.2040 > {MINION|OPENNMS}.9999: UDP, length 1308
14:21:01.756864 IP flow_router.2040 > {MINION|OPENNMS}.9999: UDP, length 1332
14:21:01.756874 IP flow_router.2040 > {MINION|OPENNMS}.9999: UDP, length 1340
14:21:01.756881 IP flow_router.2040 > {MINION|OPENNMS}.9999: UDP, length 1316
14:21:01.756889 IP flow_router.2040 > {MINION|OPENNMS}.9999: UDP, length 1316
14:21:01.795613 IP flow_router.2040 > {MINION|OPENNMS}.9999: UDP, length 1316
14:21:02.020531 IP flow_router.2040 > {MINION|OPENNMS}.9999: UDP, length 1316
14:21:02.386730 IP flow_router.2040 > {MINION|OPENNMS}.9999: UDP, length 1292
14:21:02.406797 IP flow_router.2040 > {MINION|OPENNMS}.9999: UDP, length 1340
^C
16 packets captured
19 packets received by filter
0 packets dropped by kernel
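
If several routers send flows, it can help to see how many packets each sender contributes. The following is a rough sketch under assumptions: count_flow_senders is a hypothetical helper name, and it only handles IPv4 lines in the tcpdump format shown above.

```shell
# Hypothetical helper: reads tcpdump lines on stdin and prints
# "<packet count> <sender>" per source, busiest sender first.
count_flow_senders() {
  awk '$2 == "IP" {
         src = $3
         sub(/\.[^.]+$/, "", src)   # strip the trailing ".port"
         count[src]++
       }
       END { for (s in count) print count[s], s }' | sort -rn
}
```

For example: tcpdump -n -i ens18 -c 100 udp port 9999 2>/dev/null | count_flow_senders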

Docker Networking

When running OpenNMS in Docker, it is recommended to use network_mode: host. Otherwise, Docker replaces the sender IP address (the router) with the Docker network proxy IP, as in this capture, where every packet appears to come from 172.29.0.1:

[root@0a8c0a9reg9g opennms]# tcpdump -i eth0 port 9999 and udp -n
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
14:02:41.660000 IP 172.29.0.1.47239 > OPENNMS_DOCKER_IP.distinct: UDP, length 1316
14:02:41.700059 IP 172.29.0.1.47239 > OPENNMS_DOCKER_IP.distinct: UDP, length 1316
14:02:41.780061 IP 172.29.0.1.47239 > OPENNMS_DOCKER_IP.distinct: UDP, length 1244
14:02:41.780068 IP 172.29.0.1.47239 > OPENNMS_DOCKER_IP.distinct: UDP, length 60
...
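
To confirm which network mode a running container actually uses, you can ask Docker directly. This is only a sketch: uses_host_network is a hypothetical helper name, and the container name opennms in the example is an assumption about your setup.

```shell
# Hypothetical helper: succeeds if the named container runs with host networking.
uses_host_network() {
  mode=$(docker inspect --format '{{.HostConfig.NetworkMode}}' "$1") || return 2
  [ "$mode" = "host" ]
}
```

For example: uses_host_network opennms || echo "container is not using host networking"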

Minion Health Check

Within the Karaf shell a health check can be run:

[14:28]root@minion-vom:~# ssh admin@localhost -p 8201
The authenticity of host '[localhost]:8201 ([::1]:8201)' can't be established.
RSA key fingerprint is SHA256:581Du6j1YTvVFnrZ5gqoNc9ZwEC9afYlu6WbrDtKtxU.
Are you sure you want to continue connecting (yes/no)? yes  
Warning: Permanently added '[localhost]:8201' (RSA) to the list of known hosts.
Password authentication
Password: 

    ,-.-.o     o                
    | | |.,---..,---.,---.      
    | | |||   |||   ||   |      
    ` ' '``   '``---'`   '      

  OpenNMS Minion (27.0.2) on Apache Karaf (4.2.6)

Hit '<tab>' for a list of available commands
and '[cmd] --help' for help on a specific command.
Hit '<ctrl-d>' to exit this console.
Use 'osgi:shutdown' to shutdown OpenNMS Minion.

admin@minion> opennms:health-check 
Verifying the health of the container

Connecting to OpenNMS ReST API           [ Success  ]
Verifying installed bundles              [ Success  ]
Connecting to Kafka from RPC             [ Success  ]
Connecting to Kafka from Sink Producer   [ Success  ]

=> Everything is awesome
admin@minion> 

OpenNMS Health Check

Within the Karaf shell a health check can be run:

[14:29]root@opennms:~# ssh  -p 8101 admin@localhost 
Password authentication
Password: 

   ____                   _   _ __  __  _____  
  / __ \                 | \ | |  \/  |/ ____| 
 | |  | |_ __   ___ _ __ |  \| | \  / | (___   
 | |  | | '_ \ / _ \ '_ \| . ` | |\/| |\___ \  
 | |__| | |_) |  __/ | | | |\  | |  | |____) | 
  \____/| .__/ \___|_| |_|_| \_|_|  |_|_____/  
        | |                                    
        |_|                                    

  OpenNMS (27.0.2) on Apache Karaf (4.2.6)

Hit '<tab>' for a list of available commands
and '[cmd] --help' for help on a specific command.
Hit '<ctrl-d>' to exit this console.

admin@opennms> opennms:health-check 
Verifying the health of the container

Verifying installed bundles                                       [ Success  ]
Connecting to ElasticSearch ReST API (Flows)                      [ Success  ]
Number of active alarms stored in Elasticsearch (Alarm History)   [ Success  ] => Found 7 alarms.

=> Everything is awesome
admin@opennms> 
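
Both health checks can also be run without an interactive session by passing the command to ssh. The sketch below filters the output for checks that did not succeed. failed_health_checks is a hypothetical helper name, and it assumes that every status is printed in the same "[ Status ]" layout as the Success lines above.

```shell
# Hypothetical helper: reads opennms:health-check output on stdin, prints
# every check whose bracketed status is not "Success", and fails if any exist.
failed_health_checks() {
  awk '/\[ +[A-Za-z]+ *\]/ && !/\[ +Success *\]/ { print; bad = 1 }
       END { exit bad + 0 }'
}
```

For example, on a Minion: ssh -p 8201 admin@localhost opennms:health-check | failed_health_checks
Use port 8101 for OpenNMS itself.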

Sink Consumer Graphs

The Sink Consumer graphs should not be empty. You can find them in the resource graphs menu of your OpenNMS nodes.

Flow providing nodes require SNMP

SNMP is required on the routers that export NetFlow. It is used to map flows to the correct interfaces by their interface index.

Logs

OpenNMS

Two files are relevant. Search them for "exception" or "error":

/var/log/opennms/karaf.log
/var/log/opennms/telemetryd.log
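
A quick way to run that search is a case-insensitive grep over both files. This is a minimal sketch; find_log_errors is a hypothetical helper name.

```shell
# Hypothetical helper: prints matching lines prefixed with file name and line number.
find_log_errors() {
  grep -HinE 'exception|error' "$@"
}
```

For example: find_log_errors /var/log/opennms/karaf.log /var/log/opennms/telemetryd.log | tail -n 20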

Minion

TBD


:woman_facepalming: You can fix me, I'm a wiki post.
