kernel panic trying to boot 5.7 #25

Open
xaiki opened this issue Oct 26, 2020 · 4 comments

Comments

@xaiki

xaiki commented Oct 26, 2020

I've tried building 5.7, and it panics on boot:

spi     - spi command
 Reset MT7530t image via network using TFTP protocol
set LAN/WAN WLLLLe data by tftp protocol
(Re)start USB...b-system
USB0:   mtk-xhci: init hccr be1c0000 and hcor be1c0020 hc_length 32
Register 300010f NbrPorts 3sion
Starting the controller
USB XHCI 0.96
scanning bus 0 for devices... 2 USB Device(s) found
       scanning bus for storage devices... 0 Storage Device(s) found
ethaddr="00:AA:BB:CC:DD:10"
 No USB Storage found. Upgrade FW failed!
serverip=192.168.1.2
Please choose the operation:
   1: Load system code to SDRAM via TFTP.
   2: Load system code then write to Flash via TFTP.
   3: Boot system code via Flash (default).
   4: Enter boot command line interface.
   5: Load system code then write to Flash via USB Storage.
   6: Load system code then write to Flash via Httpd.
   9: Load U-Boot code then write to Flash via TFTP.

You chose 3
                                                                                                                      0


3: System Boot system code via Flash.
## Checking image at bc050000 ...
   Image Name:   Linux-5.7.2+
   Image Type:   MIPS Linux Kernel Image (uncompressed)
   Data Size:    21383376 Bytes = 20.4 MB
   Load Address: 80001000
   Entry Point:  806a1140
   Verifying Checksum ... OK
OK
No initrd
## Transferring control to Linux (at address 806a1140) ...
## Giving linux memsize in MB, 512

Starting kernel ...

[    0.000000] Linux version 5.7.2+ (xaiki@sucre) (gcc version 9.3.0 (Debian 9.3.0-8), GNU ld (GNU Binutils for Debian) 2.35) #6 SMP Mon Oct 26 09:11:16 -03 2020
[    0.000000] SoC Type: MediaTek MT7621 ver:1 eco:3
[    0.000000] printk: bootconsole [early0] enabled
[    0.000000] CPU0 revision is: 0001992f (MIPS 1004Kc)
[    0.000000] MIPS: machine is GB-PC2
[    0.000000] Initrd not found or empty - disabling initrd
[    0.000000] VPE topology {2,2} total 4
[    0.000000] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    0.000000] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.000000] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000000000000-0x000000001bffffff]
[    0.000000]   HighMem  [mem 0x000000001c000000-0x0000000023ffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x000000001bffffff]
[    0.000000]   node   0: [mem 0x0000000020000000-0x0000000023ffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000023ffffff]
[    0.000000] percpu: Embedded 14 pages/cpu s26832 r8192 d22320 u57344
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 130176
[    0.000000] Kernel command line: console=ttyS0,57600 rootfstype=squashfs,jffs2
[    0.000000] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
[    0.000000] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes, linear)
[    0.000000] Writing ErrCtl register=00020002
[    0.000000] Readback ErrCtl register=00020002
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 496808K/524288K available (6815K kernel code, 244K rwdata, 1492K rodata, 13388K init, 238K bss, 27480K reserved, 0K cma-reserved, 65536K highmem)
[    0.000000] SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.000000] rcu: Hierarchical RCU implementation.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[    0.000000] NR_IRQS: 256
[    0.000000] random: get_random_bytes called from start_kernel+0x394/0x594 with crng_init=0
[    0.000000] CPU Clock: 900MHz
[    0.000000] clocksource: GIC: mask: 0xffffffffffffffff max_cycles: 0xcf914c9718, max_idle_ns: 440795231327 ns
[    0.000000] clocksource: MIPS: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 4247245437 ns
[    0.000010] sched_clock: 32 bits at 450MHz, resolution 2ns, wraps every 4772186110ns
[    0.015504] Calibrating delay loop... 597.60 BogoMIPS (lpj=2988032)
[    0.087786] pid_max: default: 32768 minimum: 301
[    0.097126] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.111535] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.129685] rcu: Hierarchical SRCU implementation.
[    0.142105] smp: Bringing up secondary CPUs ...
[    0.152780] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    0.152792] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.152806] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    0.152891] CPU1 revision is: 0001992f (MIPS 1004Kc)
[    0.211339] Synchronize counters for CPU 1: done.
[    0.282034] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    0.282043] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.282053] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    0.282107] CPU2 revision is: 0001992f (MIPS 1004Kc)
[    0.331989] Synchronize counters for CPU 2: done.
[    0.393144] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    0.393153] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.393162] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    0.393215] CPU3 revision is: 0001992f (MIPS 1004Kc)
[    0.451643] Synchronize counters for CPU 3: done.
[    0.511262] smp: Brought up 1 node, 4 CPUs
[    0.520139] devtmpfs: initialized
[    0.530528] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[    0.550011] futex hash table entries: 1024 (order: 3, 32768 bytes, linear)
[    0.563618] CPU 0 Unable to handle kernel paging request at virtual address 00000000, epc == 80092f28, ra == 80092f18
[    0.584696] Oops[#1]:
[    0.589161] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.2+ #6
[    0.600904] $ 0   : 00000000 00000001 00000000 9bc48000
[    0.611271] $ 4   : 00000000 00000000 00000000 00000000
[    0.621639] $ 8   : fffffff2 9bc51e28 9bd60000 00000400
[    0.632006] $12   : 9bc51b8c 00000043 815a0000 0000000f
[    0.642374] $16   : 80830000 00000001 80830370 80830000
[    0.652743] $20   : 81570000 80886d04 8085d094 00000008
[    0.663110] $24   : 00000001 00000000
[    0.673478] $28   : 9bc50000 9bc51de0 80886d24 80092f18
[    0.683847] Hi    : 00000000
[    0.689549] Lo    : 00000000
[    0.695283] epc   : 80092f28 cmpxchg_futex_value_locked+0x28/0x64
[    0.707354] ra    : 80092f18 cmpxchg_futex_value_locked+0x18/0x64
[    0.719444] Status: 11008403 KERNEL EXL IE
[    0.727739] Cause : 40800008 (ExcCode 02)
[    0.735687] BadVA : 00000000
[    0.741390] PrId  : 0001992f (MIPS 1004Kc)
[    0.749510] Modules linked in:
[    0.755564] Process swapper/0 (pid: 1, threadinfo=(ptrval), task=(ptrval), tls=00000000)
[    0.771629] Stack : 81570000 80886d04 8085d094 00000008 80886d24 80867c60 ffbc1930 0043e6cf
[    0.788217]         80830000 80838470 00000000 9bc51e24 00000000 00000400 00000400 8085d094
[    0.804806]         00000008 0000000a 80886d04 df56f1a8 ffffffff 80867bac ffffffff 80830000
[    0.821394]         00000000 80001580 00000000 9bc51e4c 9bc51e4c df56f1a8 808238a0 00000000
[    0.837983]         80830000 8085d000 8081ea08 00000001 00000001 00000000 80830000 80780000
[    0.854572]         ...
[    0.859413] Call Trace:
[    0.864258] [<80092f28>] cmpxchg_futex_value_locked+0x28/0x64
[    0.875684] [<80867c60>] futex_init+0xb4/0x128
[    0.884475] [<80001580>] do_one_initcall+0x8c/0x1c4
[    0.894151] [<8085df54>] kernel_init_freeable+0x22c/0x264
[    0.904883] [<806a131c>] kernel_init+0x14/0xfc
[    0.913682] [<800068d8>] ret_from_kernel_thread+0x14/0x1c
[    0.924391] Code: 2408fff2  00001025  0000000f <c0a30000> 14660005  00000000  00e00825  e0a10000  1020fff9
[    0.943737]
[    0.946712] ---[ end trace 6d6af4ce27cfef83 ]---
[    0.955841] Kernel panic - not syncing: Fatal exception
[    0.966239] Rebooting in 1 seconds..
[    3.629481] Reboot failed -- System halted
@neilbrown
Owner

Thanks for trying this out and reporting the results!
I had a look at the code in kernel/futex.c and the only place that futex_init() calls cmpxchg_futex_value_locked() is in futex_detect_cmpxchg(). It is passed a NULL, and the error seems to be a null dereference, but the NULL is clearly intentional and has been there for a long time.
I would probably try git bisect to find the patch that causes it to stop working. It can be a slow, laborious process.
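
The register dump in the log can be cross-checked against the NULL-dereference reading. The following is a minimal sketch, assuming the standard MIPS32 layout of the Cause register and of I-type instructions; the helper names are hypothetical and are not kernel functions. With the values from the oops, Cause 40800008 yields ExcCode 2 (a TLB exception on a load or instruction fetch), and the faulting word c0a30000 decodes to opcode 0x30 (LL) with base register $5 and offset 0. Register $5 is 00000000 in the dump, which matches BadVA 00000000.

```c
#include <stdint.h>

/* Hypothetical helpers for reading the oops by hand; not kernel code. */

/* MIPS32 Cause register: ExcCode occupies bits 6..2. */
static inline unsigned mips_cause_exccode(uint32_t cause)
{
	return (cause >> 2) & 0x1f;
}

/* MIPS32 I-type fields: opcode 31..26, rs 25..21, rt 20..16, imm 15..0. */
static inline unsigned mips_insn_opcode(uint32_t insn)
{
	return insn >> 26;
}

/* rs is the base register for loads and stores. */
static inline unsigned mips_insn_rs(uint32_t insn)
{
	return (insn >> 21) & 0x1f;
}

/* rt is the target register for loads. */
static inline unsigned mips_insn_rt(uint32_t insn)
{
	return (insn >> 16) & 0x1f;
}

/* The 16-bit immediate is a signed byte offset for loads and stores. */
static inline int16_t mips_insn_imm(uint32_t insn)
{
	return (int16_t)(insn & 0xffff);
}
```

For the values in the log, `mips_cause_exccode(0x40800008)` gives 2 and `mips_insn_opcode(0xc0a30000)` gives 0x30, so the fault is the load-linked at the start of the cmpxchg loop reading through a NULL base register.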

@xaiki
Author

xaiki commented Oct 27, 2020

I've tried your 5.6 branch, but it got me to the same point (I remember successfully running it before), so I looked into it a bit more. The body of cmpxchg_futex_value_locked, which is called with NULL, is:

	pagefault_disable();
	ret = futex_atomic_cmpxchg_inatomic(curval, uaddr, uval, newval);
	pagefault_enable();

I couldn't find where pagefault_disable() is defined.

The page fault is intentional, so it should not end in a panic. The comment says:

	/*
	 * This will fail and we want it. Some arch implementations do
	 * runtime detection of the futex_atomic_cmpxchg_inatomic()
	 * functionality. We want to know that before we call in any
	 * of the complex code paths. Also we want to prevent
	 * registration of robust lists in that case. NULL is
	 * guaranteed to fault and we get -EFAULT on functional
	 * implementation, the non-functional ones will return
	 * -ENOSYS.
	 */
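
The detection logic that comment describes can be modelled in user space. This is only a sketch under stated assumptions: the `model_*` names are hypothetical stand-ins, and the NULL check stands in for what the kernel, as far as I understand, achieves by letting the access fault and having the exception fixup return -EFAULT. It does not reproduce the crash; it only shows which return value means what.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as the arch helper: returns 0, -EFAULT or -ENOSYS. */
typedef int (*cmpxchg_fn)(uint32_t *curval, uint32_t *uaddr,
			  uint32_t oldval, uint32_t newval);

/* Models a working implementation: a bad address yields -EFAULT. */
static int model_cmpxchg_functional(uint32_t *curval, uint32_t *uaddr,
				    uint32_t oldval, uint32_t newval)
{
	if (uaddr == NULL)
		return -EFAULT;	/* assumption: stands in for the fault fixup */
	*curval = *uaddr;
	if (*uaddr == oldval)
		*uaddr = newval;
	return 0;
}

/* Models an architecture without the operation: always -ENOSYS. */
static int model_cmpxchg_missing(uint32_t *curval, uint32_t *uaddr,
				 uint32_t oldval, uint32_t newval)
{
	(void)curval;
	(void)uaddr;
	(void)oldval;
	(void)newval;
	return -ENOSYS;
}

/* Probe with NULL on purpose: only -EFAULT means "cmpxchg is usable". */
static int model_detect_cmpxchg(cmpxchg_fn fn)
{
	uint32_t curval = 0;

	assert(fn != NULL);
	return fn(&curval, NULL, 0, 0) == -EFAULT;
}
```

In this model, `model_detect_cmpxchg(model_cmpxchg_functional)` returns 1 and `model_detect_cmpxchg(model_cmpxchg_missing)` returns 0. The oops in this issue corresponds to a third outcome the comment does not expect: the probe neither returns -EFAULT nor -ENOSYS, because the fault is not recovered at all.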

I've hacked that call to always set futex_cmpxchg_enabled = 0, but 5.9.1 then hung later in the boot. I'm now retrying with your 5.7.2.

@xaiki
Author

xaiki commented Oct 27, 2020

The same happens with 5.7.2: with my hack it boots, but it hangs before dropping me into the busybox shell.

@neilbrown
Owner

There is a report of a similar problem at
http://groups.google.com/group/gnubee/t/b21f65a820e43b62
which was resolved by using a different version of the compiler.
I'm using gcc-7.2.0 and binutils 2.29.1.20170915 without problems. I can build and boot 5.10.1 without this crash.
What versions are you using?
