System freeze (4.2-RC3) #8585

Closed
rapenne-s opened this issue Oct 9, 2023 · 4 comments
Labels
affects-4.2 This issue affects Qubes OS 4.2. C: other hardware support P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. R: self-closed Voluntarily closed by the person who opened it before another resolution occurred.

Comments

@rapenne-s

rapenne-s commented Oct 9, 2023

Qubes OS release

4.2-RC3 on a T470 and a Ryzen 5 5600X desktop

Brief summary

I initially thought it was an issue with the NVMe SSD in my desktop: depending on my workload, the system would soft-lock. I could still move the cursor and switch virtual desktops, but nothing was responding. This happened three times in a row under a regular workload. top(1), which I had opened earlier just in case, showed 99% I/O wait when it happened.

I just got the exact same problem on my laptop running the same Qubes OS version. This can't be a coincidence.

Maybe related to #8575

When this happened and I had a dom0 terminal open, I could type a few commands, but anything like ls or df would wait indefinitely, as if the storage wasn't responding anymore.
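For anyone trying to catch the same symptom, the wait figure that top(1) displays can also be logged from /proc/stat, so a record survives even if the terminal stops responding. This is only a minimal sketch: the function names `iowait_percent` and `log_iowait` are made up for illustration, and in dom0 the figures cover dom0's own vCPUs, not the whole machine.

```shell
#!/usr/bin/env bash
# Compute the iowait share (in percent) between two samples of the
# aggregate "cpu" line of /proc/stat.
# Fields after "cpu": user nice system idle iowait irq softirq steal ...
iowait_percent() {
    local -a a b
    read -r -a a <<< "$1"
    read -r -a b <<< "$2"
    local total=0 i
    # Sum the deltas of the first eight counters (guest time is already in user).
    for ((i = 1; i <= 8 && i < ${#a[@]}; i++)); do
        total=$(( total + b[i] - a[i] ))
    done
    local wait=$(( b[5] - a[5] ))
    if (( total <= 0 )); then
        echo 0
        return
    fi
    echo $(( 100 * wait / total ))
}

# Hypothetical logger: prints one timestamped iowait value every N seconds.
# Defined only; call it yourself, e.g.  log_iowait 5 >> iowait.log
log_iowait() {
    local interval=${1:-5} prev cur
    prev=$(head -n 1 /proc/stat)
    while sleep "$interval"; do
        cur=$(head -n 1 /proc/stat)
        echo "$(date +%T) iowait=$(iowait_percent "$prev" "$cur")%"
        prev=$cur
    done
}
```

A sustained value close to 100% alongside hanging ls or df would be consistent with the storage stack not completing requests, though it does not say which layer is at fault.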

@rapenne-s rapenne-s added P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. T: bug labels Oct 9, 2023
@rapenne-s
Author

The Ryzen desktop is running Linux 6.5.5-1-qubes.fc37 and has all testing updates enabled.

The laptop is running Linux 6.1.41-1-qubes.fc37 and only uses stable updates.

@andrewdavidwong andrewdavidwong added C: other needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. affects-4.2 This issue affects Qubes OS 4.2. labels Oct 9, 2023
@rapenne-s
Author

After further investigation, I can reproduce it 100% of the time by restoring two recent backups of a specific qube, but only on the Ryzen system 🤔

  • older backups of the qube can be restored
  • the same backup can be restored on the other 4.2-RC3 system
  • other qubes can be restored correctly

I'm still investigating, but something is wrong in one of these:

  • my LVM pool
  • the SSD
  • Xen
  • the Linux kernel

The laptop had a similar issue but the reproducer doesn't work there, which is really weird. That would point to Xen or a weird Linux bug present in both 6.1 and 6.5.
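One way to narrow down the four suspects listed above is to look at what the kernel logged around a freeze. The following is a rough sketch only: `storage_suspects` is a made-up helper name, its pattern list is an illustrative guess and not an exhaustive set, and `qubes_dom0` is assumed to be the volume group name of a default installation.

```shell
#!/usr/bin/env bash
# Filter kernel log text on stdin for lines that often accompany stalled
# storage: NVMe messages, I/O errors, hung-task warnings and dm-thin messages.
# The patterns are assumptions chosen for illustration.
storage_suspects() {
    grep -iE 'nvme|i/o error|blocked for more than [0-9]+ seconds|device-mapper: thin' || true
}

# Hypothetical usage in dom0 (not executed by this snippet):
#   sudo dmesg | storage_suspects      # Linux kernel side
#   sudo xl dmesg                      # Xen hypervisor log
#   sudo lvs -o lv_name,data_percent,metadata_percent qubes_dom0   # thin pool usage
```

Hung-task warnings without any NVMe or I/O error lines would lean toward the kernel or LVM layers, but that is a heuristic, not a diagnosis.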

@rapenne-s
Author

So, I'll close this issue for now:

  • I can't replicate the behavior on the laptop and so far it worked fine under load
  • on dom0 on both computers, running while true; do dd if=/dev/urandom of=test.disk bs=10M count=200 ; rm test.disk ; done to write continuously to the system disk reproduced a crash on the desktop after a minute, but never on the laptop 🤷‍♀️
  • I erased the whole desktop disk with random data from another OS, no write error
  • I reinstalled 4.2-RC3 on the desktop, restored all the qubes, it's working fine so far
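The dd loop from the list above can also be written as a bounded function, so it stops by itself on a healthy system and reports how far it got. This is a sketch under assumptions: `write_stress` and its parameters are hypothetical names, and `conv=fsync` is an addition that the original loop did not use, there to force each file onto the disk before it is removed.

```shell
#!/usr/bin/env bash
# Repeatedly write a file of random data and delete it again.
# Arguments (all optional): target directory, number of rounds,
# dd block count, dd block size. The defaults mirror the loop in the
# report (bs=10M count=200), except that the round count is bounded.
write_stress() {
    local dir=${1:-.} rounds=${2:-10} count=${3:-200} bs=${4:-10M}
    local file="$dir/test.disk" i
    for ((i = 1; i <= rounds; i++)); do
        # conv=fsync (an assumption, not in the original loop) flushes the
        # data to disk before dd exits.
        dd if=/dev/urandom of="$file" bs="$bs" count="$count" \
            conv=fsync status=none || return 1
        rm -f "$file"
        echo "round $i ok"
    done
}

# Hypothetical usage in dom0: write_stress . 30
```

If the system soft-locks, the last "round N ok" line shows roughly how much had been written before the stall.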

The desktop is now running 6.1 though, instead of 6.5 as it was initially. I'm not sure how to use a newer kernel on 4.2 🤔

My wild guess is that the two issues were a coincidence, and that the desktop problem was some kernel bug tied to LVM (maybe in an incorrect state?) and not the disk itself. 🤷‍♀️

@andrewdavidwong andrewdavidwong added R: self-closed Voluntarily closed by the person who opened it before another resolution occurred. and removed needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. labels Oct 11, 2023
@rustybird

I'm not sure how to use a newer kernel on 4.2 🤔

The R4.2 dom0 testing repo has 6.5.x in the "kernel-latest" package.

If you are still able to reproduce this somewhere, that could be useful to diagnose #8575 and #8619! Assuming that it's all related.
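As a sketch of what that could look like in dom0: the repository name `qubes-dom0-current-testing` is assumed here to be the R4.2 dom0 testing repo mentioned above, and both function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Install the kernel-latest package from the dom0 testing repository.
# Assumption: the testing repo is named qubes-dom0-current-testing.
# Defined only; run it yourself in dom0, then reboot.
install_kernel_latest() {
    sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing kernel-latest
}

# Print the major.minor series of a kernel release string
# (defaults to the running kernel), to confirm which series booted.
kernel_series() {
    local v=${1:-$(uname -r)} major minor _rest
    IFS=. read -r major minor _rest <<< "$v"
    echo "$major.$minor"
}
```

After rebooting, `kernel_series` printing 6.5 would confirm that the newer kernel is the one actually running before retrying the reproducer.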
