[Bug]: vaGetImage takes 90ms on each FullHD frame #1824
Comments
Update: with the i965 driver it reaches ~100 fps; in other words, the issue doesn't reproduce.
Auto Created VSMGWL-74602 for further analysis.
Fix: intel#1824 Signed-off-by: Jay Yang <[email protected]>
I don't have a UHD 605 on hand. Could you check whether #1837 fixes it?
Hi @MicroYY, I checked out the PR branch, made sure the last commit is b69c087bb96e9a6dc809e77aabf332f8dc9ae678, and rebuilt the media driver against the master branches of libva and igdgmm. I tested it on the UHD 605 machine and I'm sure the new driver was loaded, but the same issue persists: vaGetImage now takes ~80 ms. That is less than the 90-100 ms it took before, but still far more than it should.
I am seeing a similar problem on our systems with the following CPU:
I also see it with media driver version 24.3.0 and #1837
Would an IPS ticket help?
Note that with the i965 driver it reaches ~100 fps, so the issue doesn't reproduce there (on UHD 605).
I will use the old libva driver for this use case for now, but I still want to get this working, as we need the media driver for other use cases. I ran this under perf like this, using the vaapi plugin:
and got this graph:
It looks similar when using the va plugin:
Both were done with Intel media driver 23.3.5. I see the problem with and without HuC and GuC. In Chromium the media driver works fine: watching VP9 YouTube videos is offloaded. IPS case: 00900208
@hauke-work how are those perf reports similar? In the first one there's a big memcpy which is, if I understand correctly, the main bottleneck, while in the second there isn't.
@aslobodeniuk @hauke-work @ceyusa Sorry for replying so late. I just checked the FFmpeg source code: https://github.com/FFmpeg/FFmpeg/blob/master/libavutil/hwcontext_vaapi.c#L850-L897 The surface dump rules are as below:
From the call stack, it should hit rule #2, which has significant SW latency. Starting with DG2, libva provides a new HW GPU copy API: vaCopy. An app can use it for any GPU-to-CPU or CPU-to-GPU copy without any SW latency. Separately, we have recently been optimizing vaGetImage in the UMD by using GPU copy on MTL and later.
Could you help confirm whether the current SW copy (vaGetImage) has any business impact on the current platform, or does it only affect debugging (such as dumping surfaces)? If it's only for debugging, I believe the issue will be resolved after MTL, and we don't need any changes for Gen9-Gen12. If not, we still need to improve it.
Which component impacted?
Video Processing
Is it regression? Good in old configuration?
This issue doesn't reproduce with i965 driver
What happened?
This happens on specific hardware with a UHD Graphics 605 GPU.
The issue seems to affect all versions of the iHD driver; we checked 20.1.1 and 23.4.1.
It reproduces with both ffmpeg and gstreamer (latest versions of each) and with any Full HD video.
How to reproduce:
In the ffmpeg output we can see it only reaches ~10 fps.
The same ~10 fps is reached if we download with the gstreamer vah264dec element.
Checking the libva traces, we can see that the slowest part is vaGetImage, which always takes around 90-100 ms.
Meanwhile, without downloading to CPU memory, playback of the same file can reach 700 fps.
As an approximate benchmark of the CPU: software decoding of the same file reaches 300 fps, so the CPU itself is not that slow.
Do you know a way to confirm whether it's a hardware or a driver issue?
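To make the numbers above easier to compare, here is a minimal sketch of the arithmetic relating the per-frame vaGetImage latency to the frame rate and to the effective copy bandwidth. It assumes the downloaded frames are NV12 (12 bits per pixel), which the report does not state; the function names are hypothetical helpers, not part of libva.

```python
def fps_from_latency(latency_ms: float) -> float:
    """Upper bound on frames per second if every frame costs latency_ms."""
    return 1000.0 / latency_ms


def nv12_frame_bytes(width: int, height: int) -> int:
    """Size of one NV12 frame: full-size Y plane plus half-size UV plane.

    NV12 is an assumption here; the issue does not name the pixel format.
    """
    return width * height * 3 // 2


def copy_bandwidth_mb_s(frame_bytes: int, latency_ms: float) -> float:
    """Effective copy bandwidth in MB/s (1 MB = 10**6 bytes)."""
    return frame_bytes / (latency_ms / 1000.0) / 1e6


if __name__ == "__main__":
    frame = nv12_frame_bytes(1920, 1080)
    for latency in (80.0, 90.0, 100.0):
        print(
            f"{latency:.0f} ms/frame: "
            f"{fps_from_latency(latency):.1f} fps, "
            f"{copy_bandwidth_mb_s(frame, latency):.1f} MB/s"
        )
```

Under that assumption, a 90-100 ms vaGetImage call caps throughput at roughly 10-11 fps, which matches the ffmpeg and gstreamer results, and corresponds to only a few tens of MB/s of effective copy bandwidth.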
What's the usage scenario when you are seeing the problem?
Playback
What impacted?
No response
Debug Information
lshw -C display
cat /proc/cpuinfo
Do you want to contribute a patch to fix the issue?
None