Ítalo Vieira fe71102cf8 Fix problem of OSD down when performing DOR in both controllers
When there is no active controller during host startup, the variable
$service_enabled in ceph.pp assumes its default value of False, which
causes Puppet to skip the section responsible for mounting the OSD.

This scenario happens, for example, when a DOR (Dead Office Recovery)
is performed on both controllers of an AIO-DX system simultaneously.

In this scenario, the OSD mount is expected to be triggered by the OSD
start in /etc/init.d/ceph and not by puppet. However, since the OSD
mountpoint is not available, /etc/init.d/ceph cannot find the
.osd_configured file, and the OSD start is skipped. As a result, the OSD
remains unmounted, leading to the Ceph OSD being down.

To fix this, /etc/init.d/ceph now checks for the .osd_configured
flag only if the OSD is already mounted. This allows the script to
proceed and mount the OSD as expected.
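
The decision described above can be sketched roughly as follows. This is
an illustration only, not the code from /etc/init.d/ceph: the function
names is_mounted and osd_start_allowed are hypothetical, and the example
OSD directory is an assumed path layout.

```shell
#!/bin/sh
# Hypothetical helpers; not the actual /etc/init.d/ceph implementation.

# Thin wrapper so the mount check can be stubbed in tests.
# mountpoint(1) from util-linux exits 0 when the path is a mountpoint.
is_mounted() {
    mountpoint -q "$1"
}

# Returns 0 if the OSD start should proceed, 1 if it should be skipped.
osd_start_allowed() {
    osd_dir="$1"    # e.g. /var/lib/ceph/osd/ceph-0 (assumed layout)

    if is_mounted "$osd_dir"; then
        # Mounted: the flag file is visible, so require it.
        [ -f "$osd_dir/.osd_configured" ]
        return $?
    fi

    # Not mounted: the flag lives on the unmounted filesystem and
    # cannot be seen yet, so do not skip. The caller mounts the OSD.
    return 0
}
```

With this ordering, a missing mountpoint no longer looks like a missing
flag, so the start path can go on to mount the OSD.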

Additionally, during OSD startup in /etc/init.d/ceph, the PARTLABEL of
the OSD's /dev/sdx1 device is verified before mounting. If it does not
match "ceph data", the OSD has not yet been created by Puppet, and the
start process is skipped. This prevents the OSD from being started
prematurely, which could lead to errors during Puppet OSD
configuration.

While a restore is in progress, the OSD is already mounted but not yet
defined in ceph.conf. In this state, $first_dev cannot be resolved,
which would cause the PARTLABEL check to fail. Since the restore process
must start the OSD, the PARTLABEL check is skipped while a restore is in
progress.
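
A minimal sketch of the PARTLABEL gate and its restore bypass, under
stated assumptions: the function names are hypothetical, and
RESTORE_FLAG is a placeholder because the text above does not say how a
restore in progress is detected. Only the "ceph data" label and the
skip-during-restore behavior come from this change description.

```shell
#!/bin/sh
# Hypothetical sketch; names are not from the real /etc/init.d/ceph.

# Placeholder path: the real restore-in-progress indicator is not
# given in the text above.
RESTORE_FLAG="${RESTORE_FLAG:-/path/to/restore_in_progress_flag}"

# Wrapper around blkid(8) so the lookup can be stubbed in tests.
get_partlabel() {
    blkid -o value -s PARTLABEL "$1" 2>/dev/null
}

restore_in_progress() {
    [ -f "$RESTORE_FLAG" ]
}

# Returns 0 if the OSD on data partition $1 may be started, 1 to skip.
osd_partition_ready() {
    data_dev="$1"    # e.g. /dev/sdx1

    # During a restore the device cannot be resolved from ceph.conf
    # and the OSD must start anyway, so bypass the label check.
    if restore_in_progress; then
        return 0
    fi

    # Puppet has created the OSD only if the label is "ceph data".
    [ "$(get_partlabel "$data_dev")" = "ceph data" ]
}
```

Evaluating the restore condition first means the label lookup is never
attempted with a device that cannot be resolved.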

Test-Plan:
  PASS: build-pkgs and build-image
  PASS: [SX, DX, STD] Fresh install with ceph bare metal
  PASS: [SX, DX, STD] DOR tests ensuring the OSD is mounted/started
    - Resetting all hosts at the same time
    - Resetting one host at a time
  PASS: [SX, DX, STD] host-swact
  PASS: [SX, DX, STD] lock-unlock standby controller
  PASS: [SX, DX, STD] Add OSDs at runtime
  PASS: Make /etc/init.d/ceph be called for a new OSD before puppet
        configuration and observe the OSD being skipped in ceph
        script
  PASS: Make /etc/init.d/ceph be called for an already configured
        OSD before puppet configuration and observe the OSD is not
        affected
  PASS: Make /etc/init.d/ceph be called for a new OSD
        during puppet configuration and observe the OSD being skipped
        in ceph script
  PASS: Make /etc/init.d/ceph be called right before final puppet OSD
        mount and observe the OSD being skipped in ceph script
  PASS: Confirm PARTLABEL check works for SATA, NVMe and multipath
  PASS: [SX, DX, STD] Perform BnR
  PASS: [DX] Perform Platform Upgrade to the built image

Closes-Bug: 2127198
Related-Bug: 2125764
Change-Id: I81e466c70027a88814ca97d6e67a24b72baef531
Signed-off-by: Ítalo Vieira <italo.gomesvieira@windriver.com>
2025-10-10 17:51:15 -03:00

integ

StarlingX Integration

Description
StarlingX Integration and packaging
Languages
Shell 27.9%
Python 22.5%
JavaScript 21.3%
Perl 12.7%
C++ 5.7%
Other 9.8%