Merge "Revert "Disable lock stealing to prevent starvation under extreme circumstances""

This commit is contained in:
Zuul
2025-10-03 17:58:08 +00:00
committed by Gerrit Code Review
4 changed files with 0 additions and 142 deletions

View File

@@ -1,70 +0,0 @@
From 7e61c0d0cd295008bc2d895eff474c21f4051040 Mon Sep 17 00:00:00 2001
From: Mark Asselstine <mark.asselstine@windriver.com>
Date: Sat, 2 Aug 2025 10:48:05 -0400
Subject: [PATCH] locking/rtmutex.c: disable lock stealing

The rt_mutex allows lock stealing. If a process with a higher
priority wants a lock and the top waiter is of lower priority, the
higher priority process will steal the lock. This saves the higher
priority task from having to be added to the waiter list and then
relying on priority inheritance being applied for it to get the lock
in place of the lower priority top waiter.

This lock stealing approach makes a lot of sense, but under extreme
circumstances it can cause an issue.

If we have 3 groups of tasks, all pinned to the same CPU:
- High priority RT tasks, let's call these A
- Lower priority RT tasks, let's call these B
- Non-RT tasks with lower priority, let's call these Z

We exec enough A tasks to monopolize the CPU. The scheduler will run
the A tasks, except that, thanks to "sched_rt_runtime_us", we can
ensure it will also run some of the Z tasks. But this results in no
CPU time for the B tasks. If any of the B tasks were in the process
of being created or were exiting when the A tasks started to
monopolize the CPU, then a B task will be on the tasklist_lock waiter
list and will eventually become the top waiter. Thanks to lock
stealing, A tasks can continue to be created, run, and exit, as they
will steal the tasklist_lock. However, the B task that is the top
waiter will be stuck waiting for time on the CPU (which the scheduler
never gives it), and it, along with any other task on any CPU
requiring the tasklist_lock, will wait perpetually. The only thing
that can break this is for the CPU to get through all of the A tasks'
work, at which point the B task top waiter will get on the CPU and
then all the other waiters will be worked through.

Under normal circumstances the above scenario would not happen; the B
task would, for example, be migrated to another CPU. But with the B
task pinned to the same CPU as the A tasks, which consume all of the
RT time on that CPU, and with new A tasks continuing to be created
without issue thanks to stealing, we end up in this situation.

Here we disable lock stealing. The expected outcome is that higher
priority tasks will no longer be able to steal the lock; instead they
will be added to the waiter tree, where priority inheritance will be
applied. The lower priority top waiter will be boosted, get time on
the CPU, and the situation described above is avoided.

Signed-off-by: Mark Asselstine <mark.asselstine@windriver.com>
---
kernel/locking/rtmutex.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 4a10e8c16fd2..7ae6f052e2fe 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -426,8 +426,8 @@ static __always_inline int rt_waiter_node_equal(struct rt_waiter_node *left,
static inline bool rt_mutex_steal(struct rt_mutex_waiter *waiter,
				  struct rt_mutex_waiter *top_waiter)
{
-	if (rt_waiter_node_less(&waiter->tree, &top_waiter->tree))
-		return true;
+	/* Disable lock stealing to force PI */
+	return false;
#ifdef RT_MUTEX_BUILD_SPINLOCKS
/*
--
2.34.1
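
As context for this revert, the starvation scenario the patch describes
can be sketched from user space. The following is a minimal,
hypothetical reproducer, not part of the original patch or report: the
CPU number, priorities, and task counts are illustrative assumptions,
it must run as root, and the effect is only expected on a PREEMPT_RT
kernel where tasklist_lock contention goes through the rt_mutex
stealing path.

/*
 * Hypothetical sketch of the scenario above (not from the original
 * patch): one lower-priority RT forker (B) and several higher-priority
 * RT forkers (A), all pinned to CPU 0. Every fork/exit takes
 * tasklist_lock; with lock stealing enabled, B can be starved while
 * sitting as top waiter. Needs root for SCHED_FIFO.
 */
#define _GNU_SOURCE
#include <sched.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Pin the calling task to CPU 0 so all tasks contend on one runqueue. */
static void pin_to_cpu0(void)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(0, &set);
	sched_setaffinity(0, sizeof(set), &set);
}

/* Make the calling task SCHED_FIFO at the given priority. */
static void set_fifo(int prio)
{
	struct sched_param p = { .sched_priority = prio };

	sched_setscheduler(0, SCHED_FIFO, &p);
}

/* Endless fork/exit churn: every iteration contends on tasklist_lock. */
static void fork_exit_loop(void)
{
	for (;;) {
		pid_t pid = fork();

		if (pid == 0)
			_exit(0);
		if (pid > 0)
			waitpid(pid, NULL, 0);
	}
}

int main(void)
{
	int i;

	pin_to_cpu0();

	if (fork() == 0) {		/* B: lower-priority RT forker */
		set_fifo(10);
		fork_exit_loop();
	}

	for (i = 0; i < 4; i++) {	/* A: higher-priority RT forkers */
		if (fork() == 0) {
			set_fifo(50);
			fork_exit_loop();
		}
	}
	pause();			/* parent stays non-RT: a Z task */
	return 0;
}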

View File

@@ -39,4 +39,3 @@ zl3073x-backport/0004-dpll-zl3073x-Fix-missing-header-build-error-on-older.patch
zl3073x-backport/0005-dpll-add-phase-offset-monitor-feature-to-netlink-spe.patch
zl3073x-backport/0006-dpll-add-phase_offset_monitor_get-set-callback-ops.patch
0015-sched-fair-Fix-DELAY_DEQUEUE-issue-related-to-cgroup.patch
0016-locking-rtmutex.c-disable-lock-stealing.patch

View File

@@ -1,70 +0,0 @@
From 7e61c0d0cd295008bc2d895eff474c21f4051040 Mon Sep 17 00:00:00 2001
From: Mark Asselstine <mark.asselstine@windriver.com>
Date: Sat, 2 Aug 2025 10:48:05 -0400
Subject: [PATCH] locking/rtmutex.c: disable lock stealing

The rt_mutex allows lock stealing. If a process with a higher
priority wants a lock and the top waiter is of lower priority, the
higher priority process will steal the lock. This saves the higher
priority task from having to be added to the waiter list and then
relying on priority inheritance being applied for it to get the lock
in place of the lower priority top waiter.

This lock stealing approach makes a lot of sense, but under extreme
circumstances it can cause an issue.

If we have 3 groups of tasks, all pinned to the same CPU:
- High priority RT tasks, let's call these A
- Lower priority RT tasks, let's call these B
- Non-RT tasks with lower priority, let's call these Z

We exec enough A tasks to monopolize the CPU. The scheduler will run
the A tasks, except that, thanks to "sched_rt_runtime_us", we can
ensure it will also run some of the Z tasks. But this results in no
CPU time for the B tasks. If any of the B tasks were in the process
of being created or were exiting when the A tasks started to
monopolize the CPU, then a B task will be on the tasklist_lock waiter
list and will eventually become the top waiter. Thanks to lock
stealing, A tasks can continue to be created, run, and exit, as they
will steal the tasklist_lock. However, the B task that is the top
waiter will be stuck waiting for time on the CPU (which the scheduler
never gives it), and it, along with any other task on any CPU
requiring the tasklist_lock, will wait perpetually. The only thing
that can break this is for the CPU to get through all of the A tasks'
work, at which point the B task top waiter will get on the CPU and
then all the other waiters will be worked through.

Under normal circumstances the above scenario would not happen; the B
task would, for example, be migrated to another CPU. But with the B
task pinned to the same CPU as the A tasks, which consume all of the
RT time on that CPU, and with new A tasks continuing to be created
without issue thanks to stealing, we end up in this situation.

Here we disable lock stealing. The expected outcome is that higher
priority tasks will no longer be able to steal the lock; instead they
will be added to the waiter tree, where priority inheritance will be
applied. The lower priority top waiter will be boosted, get time on
the CPU, and the situation described above is avoided.

Signed-off-by: Mark Asselstine <mark.asselstine@windriver.com>
---
kernel/locking/rtmutex.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 4a10e8c16fd2..7ae6f052e2fe 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -426,8 +426,8 @@ static __always_inline int rt_waiter_node_equal(struct rt_waiter_node *left,
static inline bool rt_mutex_steal(struct rt_mutex_waiter *waiter,
				  struct rt_mutex_waiter *top_waiter)
{
-	if (rt_waiter_node_less(&waiter->tree, &top_waiter->tree))
-		return true;
+	/* Disable lock stealing to force PI */
+	return false;
#ifdef RT_MUTEX_BUILD_SPINLOCKS
/*
--
2.34.1
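
The fallback the patch relies on, waiters queued on the waiter tree
with the owner boosted by priority inheritance, is the same mechanism
user space reaches through PI futexes. Below is a minimal sketch using
only standard pthreads APIs; nothing in it comes from the patch itself.

/*
 * User-space analogue of the priority-inheritance path the patch falls
 * back on: a PTHREAD_PRIO_INHERIT mutex is backed by a PI futex, i.e.
 * an rt_mutex in the kernel. While a higher-priority thread blocks on
 * the lock, the lower-priority owner inherits that priority, so the
 * owner gets CPU time and can release the lock instead of starving.
 */
#include <pthread.h>

static pthread_mutex_t lock;

int main(void)
{
	pthread_mutexattr_t attr;

	pthread_mutexattr_init(&attr);
	/* Request priority inheritance on contention. */
	pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
	pthread_mutex_init(&lock, &attr);
	pthread_mutexattr_destroy(&attr);

	pthread_mutex_lock(&lock);
	/* ... critical section: any waiter boosts us while we hold it ... */
	pthread_mutex_unlock(&lock);

	pthread_mutex_destroy(&lock);
	return 0;
}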

View File

@@ -37,4 +37,3 @@ zl3073x-backport/0003-devlink-introduce-devlink_nl_put_u64.patch
zl3073x-backport/0004-dpll-zl3073x-Fix-missing-header-build-error-on-older.patch
zl3073x-backport/0005-dpll-add-phase-offset-monitor-feature-to-netlink-spe.patch
zl3073x-backport/0006-dpll-add-phase_offset_monitor_get-set-callback-ops.patch
0016-locking-rtmutex.c-disable-lock-stealing.patch