INVARIANTS panic in intel_engine_init_cmd_parser #282

Closed
emaste opened this issue Jan 25, 2024 · 7 comments

@emaste
Member

emaste commented Jan 25, 2024

Describe the bug
panic: node is already on list or was not zeroed immediately upon boot

[drm] Initialized 5 GT workarounds on global
[drm] Initialized 8 engine workarounds on rcs'0
[drm] Initialized 5 whitelist workarounds on rcs'0
[drm] Initialized 14 context workarounds on rcs'0
panic: node is already on list or was not zeroed
cpuid = 7
time = 1706190275
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00dc5a04e0
vpanic() at vpanic+0x132/frame 0xfffffe00dc5a0610
panic() at panic+0x43/frame 0xfffffe00dc5a0670
intel_engine_init_cmd_parser() at intel_engine_init_cmd_parser+0x5de/frame 0xfffffe00dc5a06e0
intel_engines_init() at intel_engines_init+0x374/frame 0xfffffe00dc5a0760
intel_gt_init() at intel_gt_init+0x177/frame 0xfffffe00dc5a0790
i915_gem_init() at i915_gem_init+0x95/frame 0xfffffe00dc5a07d0
i915_driver_probe() at i915_driver_probe+0xeaa/frame 0xfffffe00dc5a0830
i915_pci_probe() at i915_pci_probe+0xa3/frame 0xfffffe00dc5a0890
linux_pci_attach_device() at linux_pci_attach_device+0x474/frame 0xfffffe00dc5a08e0
device_attach() at device_attach+0x3c5/frame 0xfffffe00dc5a0920
device_probe_and_attach() at device_probe_and_attach+0x70/frame 0xfffffe00dc5a0950
bus_generic_driver_added() at bus_generic_driver_added+0x77/frame 0xfffffe00dc5a0970
devclass_driver_added() at devclass_driver_added+0x3f/frame 0xfffffe00dc5a09b0
devclass_add_driver() at devclass_add_driver+0x138/frame 0xfffffe00dc5a09f0
_linux_pci_register_driver() at _linux_pci_register_driver+0xc1/frame 0xfffffe00dc5a0a20
i915kms_evh() at i915kms_evh+0x223/frame 0xfffffe00dc5a0a50
module_register_init() at module_register_init+0xb6/frame 0xfffffe00dc5a0a80
linker_load_module() at linker_load_module+0xc1f/frame 0xfffffe00dc5a0d80
kern_kldload() at kern_kldload+0x16f/frame 0xfffffe00dc5a0dd0
sys_kldload() at sys_kldload+0x5c/frame 0xfffffe00dc5a0e00
amd64_syscall() at amd64_syscall+0x15e/frame 0xfffffe00dc5a0f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00dc5a0f30
--- syscall (304, FreeBSD ELF64, kldload), rip = 0xc6b7244bf7a, rsp = 0xc6b702bf4a8, rbp = 0xc6b702bfa20 ---

FreeBSD version
FreeBSD 15.0-CURRENT wipbsd-n267714-217416d818df GENERIC amd64
(This is my WIP branch with a number of changes, but they should not be related; this is the same kernel as in #280, which worked until sysctl -a.)

PCI Info

vgapci0@pci0:0:2:0:     class=0x030000 rev=0x02 hdr=0x00 vendor=0x8086 device=0x3ea0 subvendor=0x17aa subdevice=0x2292
    vendor     = 'Intel Corporation'
    device     = 'WhiskeyLake-U GT2 [UHD Graphics 620]'
    class      = display
    subclass   = VGA

DRM KMOD version
From git, 1af4c68

To Reproduce
kldload i915kms

Screenshots
N/A

Additional context
Note this is GENERIC with INVARIANTS

@emaste
Member Author

emaste commented Jan 27, 2024

The panic is from

#define hash_add_rcu(ht, node, key) do {                                \
        struct lkpi_hash_head *__head = &(ht)[hash_min(key, HASH_BITS(ht))]; \
        __hash_node_type_assert(node); \
        KASSERT(((struct lkpi_hash_entry *)(node))->entry.cle_prev == NULL, \
            ("node is already on list or was not zeroed")); \
        CK_LIST_INSERT_HEAD(&__head->head, \
            (struct lkpi_hash_entry *)(node), entry); \
} while (0)

which has been there since f9e90c24737f9
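
To illustrate why this assertion can fire on a node that was never on any list, here is a minimal userspace sketch in C. It is not the LinuxKPI code: struct model_entry, looks_unlinked, alloc_unzeroed and alloc_zeroed are hypothetical names, and the 0xa5 fill is an assumption standing in for whatever an unzeroed kernel allocation happens to contain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the list linkage inside struct lkpi_hash_entry. */
struct model_entry {
	struct model_entry *next;
	struct model_entry **prev;	/* plays the role of entry.cle_prev */
};

/* The condition the KASSERT tested: "prev is NULL, so not on a list". */
static bool
looks_unlinked(const struct model_entry *e)
{
	assert(e != NULL);
	return (e->prev == NULL);
}

/*
 * Allocate a node the way a kmalloc() caller sees it: contents unspecified.
 * The 0xa5 fill is an assumption that makes the garbage deterministic here.
 */
static struct model_entry *
alloc_unzeroed(void)
{
	struct model_entry *e = malloc(sizeof(*e));

	if (e != NULL)
		memset(e, 0xa5, sizeof(*e));
	return (e);
}

/* Allocate a node the way a kzalloc() caller sees it: all bytes zero. */
static struct model_entry *
alloc_zeroed(void)
{
	return (calloc(1, sizeof(struct model_entry)));
}
```

With these helpers, looks_unlinked() is false for a node from alloc_unzeroed() and true for one from alloc_zeroed(), although neither node has been inserted anywhere. The check therefore tests how the node was allocated, not only whether it is already on a list.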

emaste added a commit to emaste/drm-kmod that referenced this issue Jan 27, 2024
@emaste
Member Author

emaste commented Jan 28, 2024

Panic does not occur with emaste@b6ecd6f applied. I assume this code has just not been run w/ INVARIANTS previously.

@evadot
Contributor

evadot commented Jan 28, 2024

Hmm, weird. Is kmalloc supposed to bzero in Linux?

@evadot
Contributor

evadot commented Jan 28, 2024

Having looked at the Linux code, it seems that they don't check this. Also, hash_add there seems to use hlist and not the hash RCU ones that we use.

@emaste
Member Author

emaste commented Jan 28, 2024

Linux has kzalloc for a zeroed allocation; kmalloc does not zero. I suspect that the KASSERT in hash_add_rcu is not valid and should be removed (or, if we decide it's actually a valuable check, we have to modify callers to zero the whole allocation, or at least cle_prev).
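
The two options can be sketched in userspace C roughly as follows. This is a simplified model, not the actual hash_add_rcu: struct model_entry, struct model_head and both insert functions are hypothetical, and the RCU/CK_LIST publication ordering is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list linkage, shaped like a BSD LIST entry. */
struct model_entry {
	struct model_entry *next;
	struct model_entry **prev;	/* plays the role of cle_prev */
};

struct model_head {
	struct model_entry *first;
};

/*
 * Option 1: drop the check. Both link fields are overwritten, so the
 * node's previous contents do not matter.
 */
static void
insert_head_unchecked(struct model_head *h, struct model_entry *e)
{
	e->next = h->first;
	if (h->first != NULL)
		h->first->prev = &e->next;
	h->first = e;
	e->prev = &h->first;
}

/*
 * Option 2: keep the check. Every caller must then clear prev (or zero
 * the whole allocation) before inserting.
 */
static void
insert_head_checked(struct model_head *h, struct model_entry *e)
{
	assert(e->prev == NULL);
	insert_head_unchecked(h, e);
}
```

In this model, option 1 accepts a node with stale contents, while option 2 only accepts a node whose prev field the caller has cleared first.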

@emaste
Member Author

emaste commented Jan 29, 2024

freebsd-git pushed a commit to freebsd/freebsd-src that referenced this issue Jan 29, 2024
hash_add_rcu asserted that the node's prev pointer was NULL in an
attempt to detect addition of a node already on a list, but the caller
is not required to provide a zeroed node.

Reported in freebsd/drm-kmod#282

Reviewed by:	bz, manu
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D43645
@emaste
Member Author

emaste commented Jan 29, 2024

Fixed by freebsd/freebsd-src@7e77089

@emaste emaste closed this as completed Jan 29, 2024
emaste added a commit to emaste/freebsd that referenced this issue Mar 22, 2024
(cherry picked from commit 7e77089)
freebsd-git pushed a commit to freebsd/freebsd-src that referenced this issue Mar 25, 2024
(cherry picked from commit 7e77089)
bsdjhb pushed a commit to bsdjhb/cheribsd that referenced this issue Aug 1, 2024