Under PREEMPT_RT, __put_task_struct() indirectly acquires sleeping
locks, so it must not be called from a non-preemptible context.
Instead of calling __put_task_struct() directly, we defer it using
call_rcu(). A workqueue would be a more natural approach, but under
PREEMPT_RT we can't allocate dynamic memory from atomic context, so
the code would become more complex: we would need to put the
work_struct instance in the task_struct and initialize it when we
allocate a new task_struct.
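
The deferral decision described above can be modeled in userspace C.
This is only a minimal sketch, not the kernel patch itself: every
model_* name is hypothetical, model_atomic stands in for a
non-preemptible context, and the singly linked list stands in for the
RCU callback queue that call_rcu() would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these are the kernel's real definitions. */
struct model_task {
    int usage;                /* reference count */
    bool freed;               /* set once the "free" has run */
    struct model_task *next;  /* embedded link, plays the role of an rcu_head */
};

static bool model_atomic;                 /* true models a non-preemptible context */
static struct model_task *model_deferred; /* models the pending RCU callbacks */

/* Models __put_task_struct(): on PREEMPT_RT this path may take sleeping locks. */
static void model_free_task(struct model_task *t)
{
    assert(t->usage == 0);    /* only free once the last reference is gone */
    t->freed = true;
}

/* Models call_rcu(): queue the free using memory already inside the task. */
static void model_defer_free(struct model_task *t)
{
    t->next = model_deferred;
    model_deferred = t;
}

/* Drop one reference; free right away only when sleeping is allowed. */
static void model_put_task(struct model_task *t)
{
    if (--t->usage != 0)
        return;
    if (model_atomic)
        model_defer_free(t);
    else
        model_free_task(t);
}

/* Models the deferred callbacks running later, where sleeping is allowed. */
static void model_run_deferred(void)
{
    while (model_deferred) {
        struct model_task *t = model_deferred;
        model_deferred = t->next;
        model_free_task(t);
    }
}
```

Because the link lives inside the task object, queuing the free needs
no allocation in atomic context, which is the property that makes the
call_rcu() approach simpler than a workqueue here.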
We hit the same panic five times: __put_task_struct() is called while
the process is holding a lock, which triggers the kernel BUG_ON. The
call trace is included below.
We also need to cherry-pick the following commits, because the context
this change depends on is missing from 5.10.18x; for example,
DEFINE_WAIT_OVERRIDE_MAP is not defined there.
* commit 5f2962401c6e
("locking/lockdep: Exclude local_lock_t from IRQ inversions")
* commit 175b1a60e880
("locking/lockdep: Clean up check_redundant() a bit")
* commit bc2dd71b2836
("locking/lockdep: Add a skip() function to __bfs()")
* commit 0cce06ba859a
("debugobjects,locking: Annotate debug_object_fill_pool() wait type
violation")
kernel BUG at kernel/locking/rtmutex.c:1331!
invalid opcode: 0000 [#1] PREEMPT_RT SMP NOPTI
......
Call Trace:
rt_spin_lock_slowlock_locked+0xb2/0x2a0
? update_load_avg+0x80/0x690
rt_spin_lock_slowlock+0x50/0x80
? update_load_avg+0x80/0x690
rt_spin_lock+0x2a/0x30
free_unref_page+0xc5/0x280
__vunmap+0x17f/0x240
put_task_stack+0xc6/0x130
__put_task_struct+0x3d/0x180
rt_mutex_adjust_prio_chain+0x365/0x7b0
task_blocks_on_rt_mutex+0x1eb/0x370
rt_spin_lock_slowlock_locked+0xb2/0x2a0
rt_spin_lock_slowlock+0x50/0x80
rt_spin_lock+0x2a/0x30
free_unref_page_list+0x128/0x5e0
release_pages+0x2b4/0x320
tlb_flush_mmu+0x44/0x150
tlb_finish_mmu+0x3c/0x70
zap_page_range+0x12a/0x170
? find_vma+0x16/0x70
do_madvise+0x99d/0xba0
? do_epoll_wait+0xa2/0xe0
? __x64_sys_madvise+0x26/0x30
__x64_sys_madvise+0x26/0x30
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Verification:
- build-pkgs; build-iso; install and boot up on aio-sx lab.
- Could not reproduce the issue during almost 24 hours of stress-ng testing:
while true; do sudo stress-ng --sched rr --mmapfork 23 -t 20; done
while true; do sudo stress-ng --sched fifo --mmapfork 23 -t 20; done
Closes-Bug: 2031597
Signed-off-by: Jiping Ma <jiping.ma2@windriver.com>
Change-Id: If022441d61492eaec88eede8603a6cb052af99d1