IW3 - Any possibility of a realtime mode? #319

Open · IkariDevGIT opened this issue Mar 3, 2025 · 16 comments

@IkariDevGIT

I really think adding a real-time mode could be a solid addition. I don’t know all the technical details on how it would work under the hood, but the benefits are definitely there. Here are a few ways it could be useful:

  • Streaming without having to wait for a full video to download; this could work for series, movies, anime, YouTube videos, or whatever else someone wants to watch.
  • Flatscreen gaming.

To give some context on how this would even be used, there's a program called Virtual Desktop that lets you remote into your PC from VR. It has an SBS mode, which means you can already watch anything on your PC in 3D if the output is SBS. If real-time support were possible, this could turn everything you see on your PC into 3D.

I actually think it would be pretty practical. I only have 12GB of VRAM and still manage to get around 26 FPS with IW3, and that’s with just a slight adjustment to the default config.

Not sure how realistic this is, but I’d love to hear your thoughts on it!

@nagadomi (Owner) commented Mar 4, 2025

If you specify Video Format=mkv, you can play back the video during conversion.
SKYBOX and Pigasus support SMB (Windows shared folders).
By playing back the mkv over SMB, a video can be watched in near real time while it is still being converted.

However, those who want realtime actually expect to be able to choose the video they want to play from inside the VR headset.
So I think the realtime request is really about UI, and what is needed is the development of a video player or a DLNA server.

@IkariDevGIT (Author)

@nagadomi Thank you for the response!

What I’m suggesting isn’t about watching a file while it’s being converted. It’s rather about applying the depth estimation and 3D conversion in real-time to anything displayed on a PC screen, including games, web videos, or live streams. Essentially, the idea would be to process frames dynamically as they appear, rather than converting a pre-existing video file.

The reason I brought up Virtual Desktop is that it already allows you to stream your PC screen to VR with SBS support. If IW3 had a real-time mode, it could theoretically take the live video feed from a PC screen and apply its depth-based conversion before sending it to VR, effectively turning anything you see into 3D.

@nagadomi (Owner) commented Mar 4, 2025

It may be technically possible.

  • ffmpeg can take a screen recording as input and output a real-time stream such as RTMP.
  • Since iw3 uses ffmpeg bindings, it could take the desktop stream as input, convert it to 3D, and output another stream.
  • Several VR video players can play streaming video. (The Oculus browser's HTML5 video player supports SBS display, so it may be possible to play the stream with it.)

However, the results for anything other than full-screen video playback may not be good.
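
As a rough illustration of that pipeline, here is a minimal sketch, assuming a Windows desktop captured with ffmpeg's gdigrab device: raw RGB frames are read from one ffmpeg process, run through a placeholder convert_to_sbs() function (a hypothetical stand-in for iw3's depth estimation and stereo warp), and piped to a second ffmpeg process that publishes an RTMP stream. The resolution, stream URL, and conversion step are all assumptions for illustration, not iw3's actual implementation.

import subprocess
import numpy as np

W, H = 1920, 1080  # assumed desktop resolution

def convert_to_sbs(frame):
    # Hypothetical stand-in for the real depth-estimation + stereo warp step;
    # here we just duplicate the frame left/right (flat SBS).
    return np.concatenate([frame, frame], axis=1)

# Capture the desktop as raw RGB24 frames (gdigrab is Windows-only;
# use x11grab on Linux or avfoundation on macOS).
capture = subprocess.Popen(
    ["ffmpeg", "-f", "gdigrab", "-framerate", "30", "-i", "desktop",
     "-f", "rawvideo", "-pix_fmt", "rgb24", "-"],
    stdout=subprocess.PIPE)

# Encode the converted frames and publish them as an RTMP stream (URL is hypothetical).
output = subprocess.Popen(
    ["ffmpeg", "-f", "rawvideo", "-pix_fmt", "rgb24", "-s", f"{W * 2}x{H}",
     "-framerate", "30", "-i", "-",
     "-c:v", "libx264", "-preset", "ultrafast", "-tune", "zerolatency",
     "-f", "flv", "rtmp://localhost/live/desktop"],
    stdin=subprocess.PIPE)

frame_size = W * H * 3
while True:
    buf = capture.stdout.read(frame_size)
    if len(buf) < frame_size:
        break
    frame = np.frombuffer(buf, dtype=np.uint8).reshape(H, W, 3)
    output.stdin.write(convert_to_sbs(frame).tobytes())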

@nagadomi (Owner) commented Mar 5, 2025

If it's toy level (an HTML5 video player fed by an MJPEG stream, maybe laggy), I can develop it in a day. I'll give it a try.
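
For reference, MJPEG over HTTP is simple enough that a toy server fits in a few lines: the server sends a multipart/x-mixed-replace response and keeps appending JPEG-encoded frames, which a browser <img> tag or HTML5 player can display as a live stream. The sketch below only illustrates the mechanism, not iw3's implementation; grab_frame() is a hypothetical stand-in for the screenshot + 3D conversion step.

import io
import time
from http.server import BaseHTTPRequestHandler, HTTPServer
from PIL import Image

def grab_frame():
    # Hypothetical stand-in: return the latest converted SBS frame.
    return Image.new("RGB", (1280, 360), "gray")

class MJPEGHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # multipart/x-mixed-replace makes the client replace the image
        # with each new part, which is what produces the live stream.
        self.send_response(200)
        self.send_header("Content-Type",
                         "multipart/x-mixed-replace; boundary=frame")
        self.end_headers()
        while True:
            buf = io.BytesIO()
            grab_frame().save(buf, format="JPEG", quality=90)
            data = buf.getvalue()
            self.wfile.write(b"--frame\r\n")
            self.wfile.write(b"Content-Type: image/jpeg\r\n")
            self.wfile.write(f"Content-Length: {len(data)}\r\n\r\n".encode())
            self.wfile.write(data + b"\r\n")
            time.sleep(1 / 30)  # cap at roughly 30 FPS

HTTPServer(("0.0.0.0", 8000), MJPEGHandler).serve_forever()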

@nagadomi (Owner) commented Mar 5, 2025

I implemented this.
It works better than I expected, but when I look at GUI windows other than full-screen video, they are slightly distorted and give me 3D sickness.

@nagadomi (Owner) commented Mar 5, 2025

If anyone is interested, please give it a try.

Switch to the dev branch first:
https://github.com/nagadomi/nunif/blob/dev/windows_package/docs/README.md#dev-branch

then see:
https://github.com/nagadomi/nunif/blob/dev/iw3/docs/desktop.md

@IkariDevGIT (Author)

I tested it, and I'm genuinely impressed that you were able to implement this in just one day; it works pretty well!

I had a couple of questions and suggestions regarding performance optimizations:

  1. Batch Size for Real-Time Mode – Is batch size adjustable in real-time mode? Allowing it to process multiple frames at once could help improve performance on lower-end GPUs, even if it introduces some latency.

  2. Frame Interpolation for Depth Processing – A potential optimization could be an option to run depth estimation only every n frames (e.g., every 2 frames) and interpolate the depth information between them. This would likely introduce some latency but could significantly improve FPS while maintaining reasonable depth accuracy (see the sketch below).
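
As a rough sketch of suggestion 2, assuming a depth model object with an infer() method like the one shown in the benchmark later in this thread, depth could be estimated only on every n-th frame and linearly blended for the frames in between. This is illustrative code for the idea, not iw3's pipeline; note that the in-between frames have to wait for the next keyframe, which is where the latency comes from.

import torch

def depth_stream(frames, model, n=2):
    # Run depth estimation only on every n-th frame and linearly
    # interpolate depth maps for the frames in between.
    prev_depth = None
    pending = []  # frames waiting for the next depth keyframe
    for i, frame in enumerate(frames):
        if i % n == 0:
            depth = model.infer(frame.unsqueeze(0)).squeeze(0)
            for j, f in enumerate(pending):
                # Blend between the two surrounding depth keyframes.
                a = (j + 1) / (len(pending) + 1)
                yield f, torch.lerp(prev_depth, depth, a)
            pending = []
            prev_depth = depth
            yield frame, depth
        else:
            pending.append(frame)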

Let me know what you think! I’d be happy to test any further improvements.

@IkariDevGIT (Author)

A little follow-up to my last message:

I wanted to propose an additional optimization that could help improve performance in real-time mode. Instead of processing depth estimation for every frame, an adaptive motion-based depth frame skipping mechanism could be implemented.

The idea is to dynamically adjust depth processing frequency based on motion intensity:

  • High Motion: If significant movement is detected between consecutive frames, depth estimation would be computed normally for each frame.
  • Low/No Motion: If little to no movement is detected, instead of recalculating depth, the previous depth frame would be reused, or depth would be interpolated between keyframes, reducing the computational load.

Combining motion-based depth frame skipping, skip-frame depth interpolation, and a proper batch-size implementation for real-time mode (this may already exist, but my testing didn't yield noticeable improvements) could lead to significantly better performance and higher FPS. A sketch of the motion-gating idea follows below.
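
A minimal sketch of the motion-gated idea, using mean absolute pixel difference as a cheap motion metric; the threshold value and the model.infer() call are assumptions for illustration, not iw3's API:

import torch

class MotionGatedDepth:
    # Re-run depth estimation only when the frame has changed enough
    # relative to the last frame we actually estimated depth for;
    # otherwise reuse the cached depth map.
    def __init__(self, model, threshold=0.02):
        self.model = model
        self.threshold = threshold
        self.key_frame = None
        self.key_depth = None

    def __call__(self, frame):
        if self.key_frame is not None:
            motion = (frame - self.key_frame).abs().mean().item()
            if motion < self.threshold:
                return self.key_depth  # low motion: reuse cached depth
        self.key_depth = self.model.infer(frame.unsqueeze(0)).squeeze(0)
        self.key_frame = frame
        return self.key_depth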

@nagadomi (Owner) commented Mar 6, 2025

I will consider improving the FPS, but when using Any_V2_S, the depth estimation model does not seem to be the bottleneck.

On my machine (RTX 3070 Ti), Any_V2_S (resolution=392) by itself can achieve about 120 FPS even with batch-size=1.
But with python -m iw3.desktop --depth-model Any_V2_S, it only achieves about 14 FPS.

Increasing the batch size will help, because it improves everything, including device memory transfer, image resizing, warping, etc.
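
As a sketch of why batching helps: per-frame overhead (host-to-device copies, resizing, warp launches) gets amortized when several frames are stacked into one tensor. Illustrative only; the model API follows the benchmark below, and the frame source is hypothetical.

import torch

def infer_batched(model, frames, batch_size=4):
    # Stack frames so memory transfer and inference are amortized
    # over the whole batch instead of paid per frame.
    depths = []
    for i in range(0, len(frames), batch_size):
        batch = torch.stack(frames[i:i + batch_size]).cuda(non_blocking=True)
        depths.extend(model.infer(batch))
    return depths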

@IkariDevGIT (Author)

Wait, what? I'm barely getting 17 FPS in normal mode and around 7–9 FPS in desktop mode.

My specs:
CPU: Intel Core i9-10850K
RAM: 16GB (3200MHz)
Storage: Installed on an SSD
GPU: RTX 3060

Any idea what could be causing this?

@nagadomi (Owner) commented Mar 6, 2025

I made some minor improvements to iw3.desktop: 14 FPS -> 24 FPS. JPEG encoding was slow, so I changed that.

The 120 FPS figure is the performance of the depth estimation model itself, not the full video conversion.

def _bench():
    import time
    import torch  # needed if this snippet is run standalone
    B = 4    # batch size
    N = 100  # timed iterations
    model = DepthAnythingModel("Any_L")
    model.load(gpu=0)
    x = torch.randn((B, 3, 392, 392)).cuda()
    model.infer(x)  # warmup
    torch.cuda.synchronize()
    with torch.no_grad():
        t = time.time()
        for _ in range(N):
            model.infer(x)
        torch.cuda.synchronize()
        print(round(1.0 / ((time.time() - t) / (B * N)), 4), "FPS")

I changed B = 4 -> B = 1 and Any_L -> Any_V2_S, then ran:

python -m iw3.depth_anything_model

Video conversion speed depends on the resolution of the input video, but it can reach 30 FPS (*1) for HD and 100 FPS for SD on my machine.
*1: 30 FPS is the same result as with Any_B; most of the processing time is spent outside depth estimation.

If you think your environment is too slow, check the following settings.

  • Depth Model: Any_V2_S
  • Depth Batch Size: 4
  • Worker Threads: 4
  • FP16: ON
  • Stream: ON
  • TTA: OFF
  • Low VRAM: OFF

Other settings are left at their default values.

@IkariDevGIT (Author)

After implementing these changes, my performance has improved to around 15-17 FPS on average, which is already a good improvement. Do you have any additional ideas for further optimizing performance?

Also, I believe the following features would greatly enhance usability, particularly when the user is not directly in front of the PC:

  • Simple controls: Basic input methods such as mouse clicks, scrolling, and keyboard input would make it much easier to use.
  • Audio transmission: The ability to transmit audio would be a valuable addition.

@Salmaun321

Hi, first of all, nice work!

I can get over 800 FPS in the benchmark if I set batch size = 8, but only about 27 FPS when streaming. My specs:
CPU: Intel Core i5-13600K
RAM: 32GB
GPU: RTX 4090

It seems that changing the preset, model, CRF, etc. has little to no effect.

@nagadomi (Owner) commented Mar 7, 2025

I parallelized screenshot capture and 3D conversion; on my machine this improved 24 FPS -> 48 FPS.
Also, probably due to browser limitations, video updates will not go above 30 FPS, so this should be sufficient performance.

@Salmaun321

It seems that changing the preset, model, CRF, etc. has little to no effect.

iw3.desktop streams a sequence of JPEG images (MJPEG) rather than using a video codec, so those options are ignored.
The only relevant option for MJPEG is --stream-quality (--stream-quality 90 by default). Specifying a lower quality value reduces network traffic.

EDIT:
If you want to change the depth model, use the --depth-model option, e.g., --depth-model Any_L
https://github.com/nagadomi/nunif/blob/dev/iw3/docs/desktop.md#stereo-setting
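
For example, combining the two options above (the values here are just illustrative):

python -m iw3.desktop --depth-model Any_L --stream-quality 80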

@Salmaun321 commented Mar 8, 2025

That last update really helped; I'm getting ~60 FPS. It is still very stuttery, but much better than before.
I'm using the Pico 4 default browser. The issue now is that it doesn't get the right aspect ratio; I always get a square image no matter whether I set half or full SBS.

@loawizard commented Mar 9, 2025

This is a dream come true for me! I only get about 14 streaming FPS, though, on a 4070 Ti. It is the same whether I stream locally on my PC or watch in the browser app on my Mi Box. Is there any way I can increase the stream speed? I'm currently using this command:

python -m iw3.desktop --depth-model Any_V2_S --divergence 2.75 --convergence 0.7

Thank you so much for doing this. If I can get it to 20-25 FPS, that would be amazing!
