Added mounts module #17

Merged: 1 commit merged into oracle-samples:main from the drgn-tools-mounts branch on Nov 29, 2023

Conversation

gtmoth (Member) commented on Nov 6, 2023:

Added a module called mounts to corelens to display mountinfo from a given vmcore.
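For illustration, the following is a minimal sketch of how mount information can be read from a vmcore using drgn's stock filesystem helpers. It is not the drgn_tools.mounts implementation itself, and the vmcore path is a placeholder:

import drgn
from drgn.helpers.linux.fs import for_each_mount, mount_dst, mount_fstype, mount_src

# Open the crash dump and load kernel debuginfo so symbols and types resolve.
prog = drgn.program_from_core_dump("/path/to/vmcore")
prog.load_default_debug_info()

# Walk each mount in the initial mount namespace and print the source device,
# mount point, and filesystem type.
for mnt in for_each_mount(prog):
    print(mount_src(mnt).decode(), mount_dst(mnt).decode(), mount_fstype(mnt).decode())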

oracle-contributor-agreement bot added the "OCA Verified" label (All contributors have signed the Oracle Contributor Agreement) on Nov 6, 2023.
brenns10 self-requested a review on November 7, 2023 at 21:27.
Review comments (resolved): doc/api.rst, drgn_tools/corelens.py, tests/test_mounts.py, drgn_tools/mounts.py
oracle-contributor-agreement bot commented:

Thank you for your pull request and welcome to our community! To contribute, please sign the Oracle Contributor Agreement (OCA).
The following contributors of this PR have not signed the OCA:

To sign the OCA, please create an Oracle account and sign the OCA in Oracle's Contributor Agreement Application.

When signing the OCA, please provide your GitHub username. After signing the OCA and getting an OCA approval from Oracle, this PR will be automatically updated.

If you are an Oracle employee, please make sure that you are a member of the main Oracle GitHub organization, and your membership in this organization is public.

oracle-contributor-agreement bot added the "OCA Required" label (At least one contributor does not have an approved Oracle Contributor Agreement) and removed the "OCA Verified" label on Nov 14, 2023.
gtmoth force-pushed the drgn-tools-mounts branch 5 times, most recently from 8eb987a to 8717b75, on November 14, 2023 at 09:40.
brenns10 (Member) commented:

It looks like clicking "accept suggestion" in the pull request review will create a commit that credits me using my non-Oracle email in a "Co-developed-by" tag. The OCA bot will then check that email and fail the contributor agreement check.

Feel free to just squash everything into a single commit and remove the Co-developed-by tags :)

oracle-contributor-agreement bot added the "OCA Verified" label and removed the "OCA Required" label on Nov 21, 2023.
Review comment (resolved): drgn_tools/mounts.py
This module lists all the mount points from a given vmcore

Signed-off-by: Gautham Ananthakrishna <[email protected]>
brenns10 merged commit 4db1ec8 into oracle-samples:main on Nov 29, 2023
5 checks passed
imran-kn added a commit that referenced this pull request Feb 3, 2024
…lock.

We have run into situations where optimistic spinning on a sleeping lock by
multiple CPUs causes high sys load, which in turn leads to other issues.

This helper allows checking for such spinners.

Some examples from a vmcore where the system ran into a fork bomb:

>>> scan_osq_node(prog)
There are spinners on one or more osq_lock
CPU(s): [21, 17, 59, 67, 6, 0, 28, 53, 4, 41, 55, 38, 63, 23, 27, 19, 36] are spinning on same osq_lock
CPU(s): [42, 56, 35, 18, 22, 3, 1] are spinning on same osq_lock
CPU(s): [70, 44, 9] are spinning on same osq_lock
CPU(s): [16, 26] are spinning on same osq_lock

>>> scan_osq_node(prog,1)
There are spinners on one or more osq_lock
CPU(s): [21, 17, 59, 67, 6, 0, 28, 53, 4, 41, 55, 38, 63, 23, 27, 19, 36] are spinning on same osq_lock
CPU: 21 has been spinning for 1.2 seconds.
CPU: 17 has been spinning for 1.3 seconds.
CPU: 59 has been spinning for 1.3 seconds.
CPU: 67 has been spinning for 1.5 seconds.
.......................................
.......................................
CPU: 19 has been spinning for 3.004 seconds.
CPU: 36 has been spinning for 3.1 seconds.
......................................

>>> scan_osq_node(prog,2)

There are spinners on one or more osq_lock
CPU(s): [21, 17, 59, 67, 6, 0, 28, 53, 4, 41, 55, 38, 63, 23, 27, 19, 36] are spinning on same osq_lock
CPU: 21 has been spinning for 0.0 seconds.

Call stack at cpu:  21
PID: 64691    TASK: ffff8fac01282f80 [R] CPU: 21!  COMMAND: "bash"
!# 0 [ffffa447d645fb58] node_cpu at 0xffffffff960f3748 kernel/locking/osq_lock.c:27:0
!# 1 [ffffa447d645fb58] osq_lock at 0xffffffff960f3748 kernel/locking/osq_lock.c:143:0
 # 2 [ffffa447d645fb68] rwsem_optimistic_spin at 0xffffffff96899cc0 kernel/locking/rwsem-xadd.c:451:0
 # 3 [ffffa447d645fb68] __rwsem_down_write_failed_common at 0xffffffff96899cc0 kernel/locking/rwsem-xadd.c:529:0
 # 4 [ffffa447d645fb68] rwsem_down_write_failed at 0xffffffff96899cc0 kernel/locking/rwsem-xadd.c:617:0
 # 5 [ffffa447d645fbf8] call_rwsem_down_write_failed at 0xffffffff96889537 arch/x86/lib/rwsem.S:117:0
 # 6 [ffffa447d645fc40] __down_write at 0xffffffff96898bcd arch/x86/include/asm/rwsem.h:142:0
 # 7 [ffffa447d645fc40] down_write at 0xffffffff96898bcd kernel/locking/rwsem.c:72:0
 # 8 [ffffa447d645fc58] lock_anon_vma_root at 0xffffffff9623ae3c mm/rmap.c:239:0
 # 9 [ffffa447d645fc58] unlink_anon_vmas at 0xffffffff9623ae3c mm/rmap.c:390:0
 #10 [ffffa447d645fca8] free_pgtables at 0xffffffff96225231 mm/memory.c:406:0
 #11 [ffffa447d645fce8] exit_mmap at 0xffffffff96230f48 mm/mmap.c:3212:0
 #12 [ffffa447d645fd98] __mmput at 0xffffffff96097ab2 kernel/fork.c:969:0
 #13 [ffffa447d645fd98] mmput at 0xffffffff96097ab2 kernel/fork.c:990:0
 #14 [ffffa447d645fdb8] copy_process at 0xffffffff96098801 kernel/fork.c:2058:0
 #15 [ffffa447d645fea0] copy_process at 0xffffffff96099db9 kernel/fork.c:1653:0
 #16 [ffffa447d645fea0] _do_fork at 0xffffffff96099db9 kernel/fork.c:2153:0
 #17 [ffffa447d645ff20] SYSC_clone at 0xffffffff9609a149 kernel/fork.c:2265:0
 #18 [ffffa447d645ff20] SyS_clone at 0xffffffff9609a149 kernel/fork.c:2259:0
 #19 [ffffa447d645ff30] do_syscall_64 at 0xffffffff96003ca9 arch/x86/entry/common.c:298:0
 #20 [ffffa447d645ff58] entry_SYSCALL_64 at 0xffffffff96a001b1 arch/x86/entry/entry_64.S:238:0
.......................................
.......................................
CPU: 36 has been spinning for 3.1 seconds.

Call stack at cpu:  36
PID: 16329    TASK: ffff8f85db930000 [R] CPU: 36!  COMMAND: bash
 0 [ffffa4481243fb68] rwsem_try_write_lock_unqueued at 0xffffffff96899d00 kernel/locking/rwsem-xadd.c:364:0
 1 [ffffa4481243fb68] rwsem_optimistic_spin at 0xffffffff96899d00 kernel/locking/rwsem-xadd.c:465:0
 2 [ffffa4481243fb68] __rwsem_down_write_failed_common at 0xffffffff96899d00 kernel/locking/rwsem-xadd.c:529:0
 3 [ffffa4481243fb68] rwsem_down_write_failed at 0xffffffff96899d00 kernel/locking/rwsem-xadd.c:617:0
 # 4 [ffffa4481243fbf8] call_rwsem_down_write_failed at 0xffffffff96889537 arch/x86/lib/rwsem.S:117:0
 # 5 [ffffa4481243fc40] __down_write at 0xffffffff96898bcd arch/x86/include/asm/rwsem.h:142:0
 # 6 [ffffa4481243fc40] down_write at 0xffffffff96898bcd kernel/locking/rwsem.c:72:0
 # 7 [ffffa4481243fc58] lock_anon_vma_root at 0xffffffff9623ae3c mm/rmap.c:239:0
 # 8 [ffffa4481243fc58] unlink_anon_vmas at 0xffffffff9623ae3c mm/rmap.c:390:0
 # 9 [ffffa4481243fca8] free_pgtables at 0xffffffff96225231 mm/memory.c:406:0
 #10 [ffffa4481243fce8] exit_mmap at 0xffffffff96230f48 mm/mmap.c:3212:0
 #11 [ffffa4481243fd98] __mmput at 0xffffffff96097ab2 kernel/fork.c:969:0
 #12 [ffffa4481243fd98] mmput at 0xffffffff96097ab2 kernel/fork.c:990:0
 #13 [ffffa4481243fdb8] copy_process at 0xffffffff96098801 kernel/fork.c:2058:0
 #14 [ffffa4481243fea0] copy_process at 0xffffffff96099db9 kernel/fork.c:1653:0
 #15 [ffffa4481243fea0] _do_fork at 0xffffffff96099db9 kernel/fork.c:2153:0
 #16 [ffffa4481243ff20] SYSC_clone at 0xffffffff9609a149 kernel/fork.c:2265:0
 #17 [ffffa4481243ff20] SyS_clone at 0xffffffff9609a149 kernel/fork.c:2259:0
 #18 [ffffa4481243ff30] do_syscall_64 at 0xffffffff96003ca9 arch/x86/entry/common.c:298:0
 #19 [ffffa4481243ff58] entry_SYSCALL_64 at 0xffffffff96a001b1 arch/x86/entry/entry_64.S:238:0

Signed-off-by: Imran Khan <[email protected]>
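For reference, a minimal usage sketch for this helper is shown below. It assumes the helper is importable as scan_osq_node from drgn_tools.lock and that the optional second argument selects the verbosity seen in the examples above; both are assumptions, not confirmed by this commit message:

import drgn
from drgn_tools.lock import scan_osq_node  # assumed import path

# Open the vmcore and load kernel debuginfo (the path is a placeholder).
prog = drgn.program_from_core_dump("/path/to/vmcore")
prog.load_default_debug_info()

scan_osq_node(prog)     # summary: groups of CPUs spinning on the same osq_lock
scan_osq_node(prog, 1)  # additionally report how long each CPU has been spinning
scan_osq_node(prog, 2)  # additionally dump the call stack of each spinning CPU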
imran-kn added a commit that referenced this pull request Mar 14, 2024
…lock(s).

imran-kn added a commit that referenced this pull request Apr 3, 2024
…lock(s).

brenns10 pushed a commit that referenced this pull request Apr 4, 2024
…lock(s).
