monitoring/.gitignore
Tara Subedi 709613a1fa DEV: Implement gnss monitoring in collectd
This commit adds new alarms driven by the monitoring ptp-instance.
instance-level alarm: gpsd daemon fault
device-level alarms:
   GNSS signal loss/no lock
   Number of satellites below configured threshold
   Average SNR below configured threshold

This reads configured threshold values and devices from
/etc/linuxptp/ptpinstance/monitoring-*.conf and compares with
gpsd data, to trigger raise/clear alarms.
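
As a rough illustration of the comparison described above, the sketch
below checks one device's data against the configured thresholds. The
field names follow the GpsData output quoted in the test plan; the
function name evaluate_device and the alarm labels are hypothetical and
are not taken from the plugin.

```python
from collections import namedtuple

# Hypothetical containers: field names mirror the GpsData output quoted
# in the test plan, but the plugin's real types may differ.
SignalQualityDb = namedtuple("SignalQualityDb", ["min", "max", "avg"])
GpsData = namedtuple(
    "GpsData",
    ["gpsd_running", "lock_state", "satellite_count", "signal_quality_db"],
)


def evaluate_device(data, satellite_count=None, signal_quality_db=None):
    """Return the set of device-level alarms that should be raised.

    A threshold of None means it is not configured (or failed to parse),
    so the corresponding check is skipped.
    """
    alarms = set()
    if not data.lock_state:
        alarms.add("signal-loss")
    if satellite_count is not None and data.satellite_count < satellite_count:
        alarms.add("satellite-count")
    if (signal_quality_db is not None
            and data.signal_quality_db.avg < signal_quality_db):
        alarms.add("signal-quality-db")
    return alarms
```

Any alarm that is absent from the returned set would be cleared, which
matches the raise/clear behaviour exercised in the test plan.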

Unit tests have been added for the gpsd protocol handling.

ptp_monitoring_cli.py has been added for testing gpsd data polling on a
live system.

TEST PLAN:
PASS: Deploy on a system where no GNSS signal is received
   system ptp-instance-add test-monitor monitoring
   system ptp-instance-parameter-add test-monitor satellite_count=12
   system ptp-instance-parameter-add test-monitor signal_quality_db=30
   system ptp-instance-parameter-add test-monitor devices="/dev/gnss0 /dev/gnss1"
   system ptp-instance-parameter-add test-monitor cmdline_opts="-D 7"

   system host-ptp-instance-assign controller-0 test-monitor
   system host-update controller-0 clock_synchronization=ptp
   system ptp-instance-apply

   - The following alarms are raised for both device_path=/dev/gnss0 and
    device_path=/dev/gnss1:
      controller-0 GNSS signal quality db below threshold state:
         signal_quality_db 0 (expected: >= 30.0)
      controller-0 GNSS satellite count below threshold state:
         satellite count 0 (expected: >= 12)
      controller-0 GNSS signal loss state: signal lock False (expected: True)

   - gpsd.service process is running and there is no alarm for this
      service
   - "systemctl stop gpsd.service" triggers "gpsd.service enabled
      but not running" alarm
   - "systemctl start gpsd.service" clears "gpsd.service enabled
      but not running" alarm

PASS: Without devices, no device-specific alarms are reported and
     there are no errors in collectd.log
   system ptp-instance-parameter-delete test-monitor devices=
       "/dev/gnss0 /dev/gnss1"
   system ptp-instance-apply
   -  fm alarm-list # no alarms
   - "systemctl stop gpsd.service" triggers "gpsd.service enabled but
     not running" alarm
   - "systemctl start gpsd.service" clears "gpsd.service enabled but
     not running" alarm

PASS: Add an invalid satellite_count value; only that alarm gets excluded
   system ptp-instance-parameter-add test-monitor devices=
       "/dev/gnss0 /dev/gnss1"
   system ptp-instance-parameter-add test-monitor 'satellite_count=x'
   system ptp-instance-apply
   collectd log: ptp plugin Reading satellite_count from monitoring
       config file /etc/linuxptp/ptpinstance/monitoring-ptp.conf
       failed. error: invalid literal for int() with base 10: 'x'
   - Only the satellite_count alarm gets excluded, working as expected

PASS: Add an invalid signal_quality_db value; only that alarm gets excluded
   system ptp-instance-parameter-add test-monitor
       'signal_quality_db=100.x'
   system ptp-instance-apply
   collectd log: ptp plugin Reading signal_quality_db from monitoring
       config file /etc/linuxptp/ptpinstance/monitoring-ptp.conf failed.
       error: could not convert string to float: '100.x'
   - No traceback in collectd.log; the invalid signal_quality_db has no
       effect on other alarms.
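
The two invalid-value tests above rely on each threshold being parsed
independently, so that one bad value does not disable the others. A
minimal sketch of that idea follows; it assumes the raw parameters are
already available as a dict of strings (the monitoring config file
format is not shown here), and parse_thresholds and CONVERTERS are
hypothetical names.

```python
import logging

log = logging.getLogger("ptp-monitoring-sketch")

# Hypothetical mapping of parameter name to the type it must convert to.
CONVERTERS = {"satellite_count": int, "signal_quality_db": float}


def parse_thresholds(raw):
    """Convert raw string parameters, dropping any value that fails to parse.

    A dropped key means the matching alarm is simply not evaluated; the
    remaining thresholds are unaffected.
    """
    parsed = {}
    for key, convert in CONVERTERS.items():
        if key not in raw:
            continue
        try:
            parsed[key] = convert(raw[key])
        except ValueError as err:
            # Log and continue instead of letting the exception propagate.
            log.error("Reading %s failed. error: %s", key, err)
    return parsed
```

With satellite_count set to 'x', only signal_quality_db survives
parsing, so only the satellite count check would be skipped.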

PASS: Test with a single device and a float value
   system ptp-instance-parameter-add test-monitor devices="/dev/gnss0"
   system ptp-instance-parameter-add test-monitor
       'signal_quality_db=100.7'
   system ptp-instance-apply
   fm alarm-list
   controller-0 GNSS signal quality db below threshold state:
       signal_quality_db 0 (expected: >= 100.7)
   controller-0 GNSS satellite count below threshold state:
       satellite count 0 (expected: >= 5)
   controller-0 GNSS signal loss state: signal lock False (expected: True)

PASS: Deploy on a system where a GNSS signal is received
   Test with cli first: sudo python /usr/rootdirs/opt/
       collectd/extensions/python/ptp_monitoring_cli.py
       /dev/gnss0's gps_data: GpsData(gpsd_running=1, lock_state=1,
       satellite_count=10,
       signal_quality_db=SignalQualityDb(min=31.0, max=48.0, avg=43.6))
       /dev/gnss1's gps_data: GpsData(gpsd_running=1, lock_state=1,
       satellite_count=10,
       signal_quality_db=SignalQualityDb(min=30.0, max=48.0, avg=43.5))
       Error: ptp plugin /dev/gnssx is not being monitored by GPSD
       /dev/gnssx's gps_data: GpsData(gpsd_running=1, lock_state=0,
       satellite_count=0, signal_quality_db=SignalQualityDb
       (min=0, max=0, avg=0))

       This shows the satellite count and signal_quality_db of each
       device, which the following tests can be checked against.
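
For orientation, values like the ones above could be derived from a
gpsd SKY report, whose satellites list carries a per-satellite "ss"
(signal strength) and "used" flag. The sketch below is an assumption
about how such a summary might be computed, not the plugin's code: it
counts only satellites flagged as used, rounds the average to one
decimal, and summarize_sky is a hypothetical name.

```python
import json
from collections import namedtuple

SignalQualityDb = namedtuple("SignalQualityDb", ["min", "max", "avg"])


def summarize_sky(sky_json):
    """Return (satellite_count, SignalQualityDb) for one gpsd SKY report.

    Assumption: only satellites with "used": true and an "ss" value
    contribute to the count and to the min/max/avg signal quality.
    """
    sky = json.loads(sky_json)
    snrs = [
        sat["ss"]
        for sat in sky.get("satellites", [])
        if sat.get("used") and "ss" in sat
    ]
    if not snrs:
        # Same shape as the all-zero data shown for an unmonitored device.
        return 0, SignalQualityDb(0, 0, 0)
    avg = round(sum(snrs) / len(snrs), 1)
    return len(snrs), SignalQualityDb(min(snrs), max(snrs), avg)
```

Feeding it a SKY report with no usable satellites yields the all-zero
result, which would trip both device-level thresholds.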

   system ptp-instance-add test-monitor monitoring
   system ptp-instance-parameter-add test-monitor satellite_count=8
   system ptp-instance-parameter-add test-monitor signal_quality_db=30
   system ptp-instance-parameter-add test-monitor devices=
       "/dev/gnss0 /dev/gnss1"
   system ptp-instance-parameter-add test-monitor cmdline_opts="-D 7"

   system host-ptp-instance-assign controller-0 test-monitor
   system host-update controller-0 clock_synchronization=ptp
   system ptp-instance-apply

   - fm alarm-list # reports no alarms
   - check collectd.log for actual GpsData:
       info ptp plugin instance monitoring-ptp device /dev/gnss0 data: GpsData(..)

  PASS: Increase thresholds to check that device-specific alarms are triggered
   system ptp-instance-parameter-add test-monitor satellite_count=100
   system ptp-instance-parameter-add test-monitor signal_quality_db=300
   system ptp-instance-apply

   - fm alarm-list # reports both alarms on both devices

  PASS: Test the monitoring instance together with other instances
        (except ts2phc)
  PASS: Remove the monitoring instance, keep the other instances

Story: 2011345
Task: 52521

Change-Id: I52d1451cd7cac364bcaeff850a424ddcc8e8de94
Signed-off-by: Tara Nath Subedi <tara.subedi@windriver.com>
2025-07-29 12:01:30 -04:00


.tox
.coverage
.stestr
__pycache__/