Can't get LibVF to pick up after reboot #63

Closed
jon-bit opened this issue Feb 2, 2023 · 13 comments

Comments

jon-bit commented Feb 2, 2023

I'm using Fedora 37. When I install LibVF and reboot, the script does NOT pick up where I left off; it starts the process all over again. I have no clue what I'm doing wrong, and I've been trying to fix it for days. Can anyone help? I'm putting a Pastebin of the output below.

https://pastebin.com/BaeC1jUv

Any help is nice so thank you in advance.

Contributor

arthurrasmusson commented Feb 2, 2023

Hey @jon-bit. Thanks for pointing this out - I'll try my best to help you with whatever is going on here.

Can you start by running ./scripts/generate-debug-information.sh then dump the resulting ./logs/debug.log for me?

Thanks.

Author

jon-bit commented Feb 2, 2023

I ran it again because I rebooted, and saw this:

Failed to disable unit: Unit file nvidia-vgpud.service does not exist.
Failed to stop nvidia-vgpud.service: Unit nvidia-vgpud.service not loaded.

But if that log is needed, here it is:

https://pastebin.com/PChqHd3u

Contributor

arthurrasmusson commented Feb 2, 2023

Delete your debug.log file then run sudo systemctl restart gvm-post.service after reboot. If you can re-run the debug log script and post it after doing that I'll take a look.

I suspect this is your issue:
Arc-Compute/GVM-user#3

You can also reach me in the Open-IOV Community Group Chat - I'll probably be able to help you more quickly in there:
https://discord.gg/Rb9K9DYxKK

Author

jon-bit commented Feb 2, 2023

I ran the restart on gvm-post and got:

Failed to restart gvm-post.servic.service: Unit gvm-post.servic.service not found.

Here is the output of debug.log:

https://pastebin.com/tw3fsWa8

Collaborator

mbuchel commented Feb 2, 2023

You typed the restart command wrong. Try just:

sudo systemctl restart gvm-post

it should autocomplete for you
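
The error quoted earlier names `gvm-post.servic.service` because, as far as I understand it, systemctl appends `.service` to any name that does not already end in a recognised unit type. The sketch below imitates that rule so the typo is easy to see; `resolve_unit_name` is a hypothetical helper written for illustration and is not part of systemd or LibVF.

```shell
# Imitates how systemctl completes a unit name given on the command line:
# names without a recognised unit-type suffix get ".service" appended.
# resolve_unit_name is a hypothetical helper, not a systemd or LibVF command.
resolve_unit_name() {
  case "$1" in
    *.service|*.socket|*.device|*.mount|*.automount|*.swap|*.target|*.path|*.timer|*.slice|*.scope)
      printf '%s\n' "$1" ;;          # already a full unit name
    *)
      printf '%s\n' "$1.service" ;;  # no valid suffix: ".service" is appended
  esac
}

resolve_unit_name gvm-post           # gvm-post.service
resolve_unit_name gvm-post.service   # gvm-post.service
resolve_unit_name gvm-post.servic    # gvm-post.servic.service (the unit from the error)
```

Both `gvm-post` and `gvm-post.service` resolve to the same unit, which is why dropping the suffix avoids the typo.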

Collaborator

mbuchel commented Feb 2, 2023

Also, you have a 3070, and support for Ampere is still super experimental.

Author

jon-bit commented Feb 2, 2023

OK, I just saw this in the logs:

Device Check Test succeeded
Device Major Number Identifier Null succeeded
Device Major Number Identifier succeeded
DeviceFileMode Nvidia Check succeeded
Open /dev/nvidiactl File succeeded
Open /dev/nvidia0 File succeeded
All tests succeeded
# GVM-user TEST-DEVICE END
# GVM-user TEST-NVIDIA-API START
RM Version Check Ok: Ensure your version check is correctly: 510.47.03
RM Version Check Incorrect: Invalid version is reported correctly, please check your driver version.
RM Version Check Ignore succeeded
RM Alloc Root Ok succeeded
RM Alloc Root Fail (Invalid FD) succeeded
RM Alloc Root Fail (Bad FD) succeeded
RM Alloc Root Fail (Bad Argument) succeeded
RM Free Root Ok succeeded
RM Free Root Fail (Invalid FD) succeeded
RM Free Root Fail (Double Deallocate) succeeded
Get Probed Ids succeeded
2/11 tests failed
# GVM-user TEST-NVIDIA-API END
# GVM-user TEST-NVIDIA-MANAGER START
Created gpu: 0x00000100 (0x10DE, 0x2482, 0x10DE, 0x5052)
Failed RM Control Mechanism:
	client: 0xC1D00046
	object: 0xD0014603
	cmd: 0xA0810101
	flags: 0x00000000
	params: 0x7fff9a52c380
	size: 0x000011C0
	status: 0x0000003A
Destroyed gpu: 0x00000100 (0x10DE, 0x2482, 0x10DE, 0x5052)
Create MDevs succeeded
All tests succeeded
# GVM-user TEST-NVIDIA-MANAGER END
# UNAME START
Linux fedora 6.0.18-300.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Jan 7 17:10:00 UTC 2023 x86_64 GNU/Linux
# UNAME END
# CPUINFO START
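
A long debug.log is easier to scan if only the per-section verdicts are printed. This is a minimal sketch that assumes the log keeps the `# GVM-user <SECTION> START` markers and the `tests failed` / `tests succeeded` summary lines shown in the excerpt above; `summarize_gvm_tests` is a hypothetical helper, not something shipped with LibVF or GVM-user.

```shell
# Print one verdict line per GVM-user test section found in a debug log.
# summarize_gvm_tests is a hypothetical helper; the marker and summary formats
# are assumed from the log excerpt in this thread.
summarize_gvm_tests() {
  awk '
    /^# GVM-user .* START$/     { section = $3 }           # remember the current section name
    /tests (failed|succeeded)$/ { print section ": " $0 }  # emit the section summary line
  ' "$1"
}
```

It could be run as `summarize_gvm_tests ./logs/debug.log`. For the excerpt above it would flag TEST-NVIDIA-API as the section reporting `2/11 tests failed`.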

Collaborator

mbuchel commented Feb 2, 2023

You are using 525.60. The libvf.io scripts may not have been updated to support this yet, but that is kind of moot, as vgpu unlock does not support the 3070. The GVM suite will, but it is not there yet. Feel free to join the Open-IOV Discord.

Author

jon-bit commented Feb 2, 2023

Sorry, but I don't have Discord. Regardless, how do we fix this?

Collaborator

mbuchel commented Feb 2, 2023

You can install the gvm-user utils and select the nvidia/525.60 branch to compile them:

https://github.com/Open-IOV/GVM-user/tree/nvidia/525

The issue will be handling the vgpu unlock for the 3070 series, which is much more complex.
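
Selecting the branch can be sketched as below. The linked URL ends in `nvidia/525`, so this sketch ASSUMES branches are named `nvidia/<driver major version>`. That convention is inferred from this one link only, and both helper names are hypothetical. The script only prints the clone command; the build steps are not given in this thread and are left out.

```shell
# gvm_user_branch is a hypothetical helper. It ASSUMES the branch naming
# "nvidia/<driver major>", inferred only from the nvidia/525 URL in this thread.
gvm_user_branch() {
  printf 'nvidia/%s\n' "${1%%.*}"   # keep everything before the first dot
}

# Print (do not run) a clone command for the given driver version.
print_clone_command() {
  printf 'git clone --branch %s https://github.com/Open-IOV/GVM-user.git\n' \
    "$(gvm_user_branch "$1")"
}

print_clone_command 525.60
# prints: git clone --branch nvidia/525 https://github.com/Open-IOV/GVM-user.git
```

It is worth confirming that the branch actually exists in the repository before relying on the derived name.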

@arthurrasmusson
Contributor

@jon-bit See GPU Support on Open-IOV:
https://open-iov.org/index.php/GPU_Support

@arthurrasmusson
Contributor

@jon-bit As this issue appears to originate from an unsupported device, I'm going to close this thread for now.

Author

jon-bit commented Feb 3, 2023

I'll respect the fact that this is not a supported card, but do you know of any other software that could help with GPU virtualization?
