nova/releasenotes/notes/bug-2122036-hypervisor-uptime-performance-optimization-6f3a2c8e5d9b1a4e.yaml
Sean Mooney 567dbe1867 hypervisors: Optimize uptime retrieval for better performance
The /os-hypervisors/detail API endpoint was experiencing significant
performance issues in environments with many compute nodes when using
microversion 2.88 or higher, as it made sequential RPC calls to gather
uptime information from each compute node.

This change optimizes uptime retrieval by:

* Adding uptime to periodic resource updates sent by nova-compute to the
  database, eliminating synchronous RPC calls during API requests
* Restricting RPC-based uptime retrieval to hypervisor types that support
  it (libvirt and z/VM), avoiding unnecessary calls that would always fail
* Preferring cached database uptime data over RPC calls when available

Closes-Bug: #2122036
Assisted-By: Claude <noreply@anthropic.com>
Change-Id: I5723320f578192f7e0beead7d5df5d7e47d54d2b
Co-Authored-By: Sylvain Bauza <sbauza@redhat.com>
Signed-off-by: Sean Mooney <work@seanmooney.info>
2025-09-05 19:03:38 +01:00

---
fixes:
  - |
    Fixed a performance issue with the ``/os-hypervisors/detail`` API endpoint
    when using microversion 2.88 or higher. The API was making sequential RPC
    calls to each compute node to gather uptime information, causing significant
    delays in environments with many compute nodes (LP#2122036).

    The fix optimizes uptime retrieval by:

    * Adding uptime information to the periodic resource updates sent by
      nova-compute to the database, eliminating the need for synchronous RPC
      calls during API requests
    * Only attempting RPC-based uptime retrieval for hypervisor types that
      actually support it (libvirt and z/VM), avoiding unnecessary calls to
      other hypervisor types that would always raise NotImplementedError
    * Preferring cached uptime data from the database over RPC calls when
      available. The cached value is refreshed at the cadence specified by
      ``[DEFAULT]update_resources_interval``, the same interval at which the
      other hypervisor stats are updated.

    This change significantly reduces response times for the hypervisor detail
    API in large deployments while maintaining backward compatibility.
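
The lookup order described in the note can be illustrated with a minimal Python sketch. This is not the code from the change itself: the function name `resolve_uptime`, the `RPC_UPTIME_TYPES` set, the type labels `"libvirt"` and `"zvm"`, and the `rpc_fetch` callable are all hypothetical names chosen for illustration.

```python
from typing import Callable, Optional

# Hypothetical labels for the hypervisor types the note says support
# RPC-based uptime retrieval (libvirt and z/VM).
RPC_UPTIME_TYPES = frozenset({"libvirt", "zvm"})


def resolve_uptime(
    hypervisor_type: str,
    cached_uptime: Optional[str],
    rpc_fetch: Callable[[], str],
) -> Optional[str]:
    """Return an uptime string, preferring the cached database value.

    rpc_fetch stands in for the per-node RPC call; it is only invoked
    when no cached value exists and the hypervisor type supports it.
    """
    # 1. Prefer the value written by the periodic resource update.
    if cached_uptime is not None:
        return cached_uptime

    # 2. Skip the RPC entirely for types that cannot report uptime.
    if hypervisor_type.lower() not in RPC_UPTIME_TYPES:
        return None

    # 3. Fall back to the RPC, tolerating drivers that do not implement it.
    try:
        return rpc_fetch()
    except NotImplementedError:
        return None
```

With this ordering, a request for many compute nodes only pays the RPC cost for nodes that have no cached uptime yet and whose hypervisor type can answer the call.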