Distributed single-file transcoding #262

Open
arcastro opened this issue Oct 17, 2023 · 22 comments
Labels
enhancement New feature or request

Comments

@arcastro

Is your feature request related to a problem? Please describe.
It's possible that the individual nodes in the cluster are not powerful enough to transcode 4K video in real time. For example, a single node might only be able to transcode 5 seconds of "video time" in 10 seconds of "real time", which is not enough to keep up with continuous playback.

Describe the solution you'd like
It would be great if a single transcode job could be "chunked" and distributed amongst the cluster. For example, the video could be split into 5-second chunks, each sent to a node to transcode, and then recombined by the orchestrator. In the example above, three nodes working together would be able to transcode 15 seconds of "video time" in 10 seconds of "real time", which is sufficient to keep up with continuous playback.
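
A minimal sketch of the idea (illustrative only, not ClusterPlex's actual API; it assumes ffmpeg is available and that the orchestrator already knows the source duration), just to show the chunk/transcode/recombine flow:

```typescript
// Hypothetical sketch: split a source into fixed-length chunks, transcode them
// in parallel (local ffmpeg processes stand in for remote workers here), then
// stitch the results back together with ffmpeg's concat demuxer.
import { execFile } from "node:child_process";
import { promisify } from "node:util";
import { writeFile } from "node:fs/promises";

const run = promisify(execFile);
const CHUNK_SECONDS = 5;

async function transcodeChunk(input: string, index: number): Promise<string> {
  const out = `chunk-${index}.mkv`;
  // -ss before -i seeks to the chunk start; -t limits the chunk duration.
  await run("ffmpeg", [
    "-ss", String(index * CHUNK_SECONDS), "-t", String(CHUNK_SECONDS),
    "-i", input,
    "-c:v", "libx264", "-preset", "veryfast", "-c:a", "aac",
    "-y", out,
  ]);
  return out;
}

async function transcodeDistributed(input: string, durationSeconds: number): Promise<void> {
  const chunkCount = Math.ceil(durationSeconds / CHUNK_SECONDS);
  // In a real cluster, each of these calls would be dispatched to a different worker.
  const chunks = await Promise.all(
    Array.from({ length: chunkCount }, (_, i) => transcodeChunk(input, i)),
  );
  // Recombine with the concat demuxer, stream-copying the already-encoded chunks.
  await writeFile("concat.txt", chunks.map((c) => `file '${c}'`).join("\n"));
  await run("ffmpeg", [
    "-f", "concat", "-safe", "0", "-i", "concat.txt",
    "-c", "copy", "-y", "output.mkv",
  ]);
}
```

The hard parts in practice are keeping chunk boundaries on keyframes and keeping timestamps and segment numbering consistent so playback doesn't stutter at the seams.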

@arcastro arcastro added the enhancement New feature or request label Oct 17, 2023

This issue is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Issue has been inactive for more than 30 days label Nov 17, 2023

github-actions bot commented Dec 1, 2023

This issue was closed because it has been inactive for 14 days since being marked as stale.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Dec 1, 2023
@pabloromeo pabloromeo reopened this Dec 1, 2023
@github-actions github-actions bot removed the stale Issue has been inactive for more than 30 days label Dec 2, 2023
@github-actions github-actions bot added the stale Issue has been inactive for more than 30 days label Jan 1, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jan 16, 2024
@pabloromeo pabloromeo reopened this Jan 16, 2024
@github-actions github-actions bot removed the stale Issue has been inactive for more than 30 days label Jan 17, 2024
@github-actions github-actions bot added the stale Issue has been inactive for more than 30 days label Feb 17, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Mar 3, 2024
@pabloromeo pabloromeo reopened this Mar 3, 2024
@github-actions github-actions bot removed the stale Issue has been inactive for more than 30 days label Mar 4, 2024
@github-actions github-actions bot added the stale Issue has been inactive for more than 30 days label Apr 4, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Apr 19, 2024
@pabloromeo pabloromeo reopened this Apr 19, 2024
@arcastro
Author

Would it be worthwhile to configure the exempt-issue-labels option here (in the stale workflow)?

The label could then be applied to this issue to prevent it from auto-closing.

(I would also not be offended if you allow the issue to close as wont-do. I am not in terrible need of this feature, so please don't feel the need to keep it open for my sake 😊)
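
For reference, the actions/stale configuration could look roughly like this (a hedged sketch with illustrative values, not this repo's actual workflow):

```yaml
# .github/workflows/stale.yml -- illustrative values only
name: Mark stale issues
on:
  schedule:
    - cron: "30 1 * * *"
jobs:
  stale:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/stale@v9
        with:
          days-before-stale: 30
          days-before-close: 14
          stale-issue-label: stale
          # Issues carrying any of these labels are never marked stale
          exempt-issue-labels: "enhancement,pinned"
```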

@github-actions github-actions bot removed the stale Issue has been inactive for more than 30 days label Apr 20, 2024
@github-actions github-actions bot added the stale Issue has been inactive for more than 30 days label May 20, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jun 3, 2024
@pabloromeo pabloromeo reopened this Jun 3, 2024
@github-actions github-actions bot removed the stale Issue has been inactive for more than 30 days label Jun 4, 2024
@github-actions github-actions bot added the stale Issue has been inactive for more than 30 days label Jul 4, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jul 18, 2024
@pabloromeo pabloromeo reopened this Jul 18, 2024
@github-actions github-actions bot removed the stale Issue has been inactive for more than 30 days label Jul 19, 2024
@github-actions github-actions bot added the stale Issue has been inactive for more than 30 days label Aug 18, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Sep 1, 2024
@pabloromeo pabloromeo reopened this Sep 1, 2024
@github-actions github-actions bot removed the stale Issue has been inactive for more than 30 days label Sep 2, 2024
@FelixClements

If I'm not mistaken, the env setting STREAM_SPLITTING still exists? Does this setting not work, or is it very unstable?

@FelixClements

So I tried to implement it and it somewhat works, but maybe @pabloromeo you could point me in the right direction. From what I can tell it is able to send tasks to different servers, but the last sent task just stays on server3? My fork is here: https://github.com/FelixClements/clusterplex

plex_orchestrator.1.j2cqglzqo0i6@server02    | ON_DEATH: debug mode enabled for pid [1]
plex_orchestrator.1.j2cqglzqo0i6@server02    | Initializing orchestrator
plex_orchestrator.1.j2cqglzqo0i6@server02    | Using Worker Selection Strategy: LOAD_TASKS
plex_orchestrator.1.j2cqglzqo0i6@server02    | Stream-Splitting: ENABLED
plex_orchestrator.1.j2cqglzqo0i6@server02    | Setting up websockets
plex_orchestrator.1.j2cqglzqo0i6@server02    | Ready
plex_orchestrator.1.j2cqglzqo0i6@server02    | Server listening on port 3500
plex_orchestrator.1.j2cqglzqo0i6@server02    | Client connected: cjB34ZUtXt7VccYwAAAB
plex_orchestrator.1.j2cqglzqo0i6@server02    | Registering worker 3914952c-8cf3-42ce-b006-1a4fd63f492a|plex-worker-server04
plex_orchestrator.1.j2cqglzqo0i6@server02    | Registered new worker: 3914952c-8cf3-42ce-b006-1a4fd63f492a|plex-worker-server04
plex_orchestrator.1.j2cqglzqo0i6@server02    | Client connected: QiHgJdawYKOyoNuyAAAD
plex_orchestrator.1.j2cqglzqo0i6@server02    | Registering worker a9c36c4d-ffee-4cd3-8496-c4995836c664|plex-worker-server02
plex_orchestrator.1.j2cqglzqo0i6@server02    | Registered new worker: a9c36c4d-ffee-4cd3-8496-c4995836c664|plex-worker-server02
plex_orchestrator.1.j2cqglzqo0i6@server02    | Client connected: 9yndKkzeOWSxHerZAAAF
plex_orchestrator.1.j2cqglzqo0i6@server02    | Registering worker 6d099096-c2c4-4456-90da-9ec77316d8ab|plex-worker-server03
plex_orchestrator.1.j2cqglzqo0i6@server02    | Registered new worker: 6d099096-c2c4-4456-90da-9ec77316d8ab|plex-worker-server03
plex_orchestrator.1.j2cqglzqo0i6@server02    | Client connected: svwp5Ag1FhX3iYbMAAAH
plex_orchestrator.1.j2cqglzqo0i6@server02    | Registering worker a106d06b-f6c5-44c5-8e95-6ecd30c4aa7e|plex-worker-server01
plex_orchestrator.1.j2cqglzqo0i6@server02    | Registered new worker: a106d06b-f6c5-44c5-8e95-6ecd30c4aa7e|plex-worker-server01
plex_orchestrator.1.j2cqglzqo0i6@server02    | Client connected: UAacu1rC39oBo0JRAAAJ
plex_orchestrator.1.j2cqglzqo0i6@server02    | Registered new job poster: 9f11430a-852d-4710-b6d3-090d7eb64546|3d4396666091
plex_orchestrator.1.j2cqglzqo0i6@server02    | Creating multiple tasks for the job
plex_orchestrator.1.j2cqglzqo0i6@server02    | All Args => -codec:0,h264,-codec:1,ac3,-analyzeduration,20000000,-probesize,20000000,-i,/data/path/to/file,-analyzeduration,20000000,-probesize,20000000,-i,/transcode/Transcode/Sessions/plex-transcode-f34792abc0a049d0-com-plexapp-android-889da67e-f296-4229-9e3d-3a898993bc9a/temp-0.srt,-filter_complex,[0:0]scale=w=480:h=240:force_divisible_by=4[0];[0]format=pix_fmts=yuv420p|nv12[1],-map,[1],-metadata:s:0,language=eng,-codec:0,libx264,-crf:0,22,-maxrate:0,541k,-bufsize:0,1082k,-r:0,23.975999999999999,-preset:0,veryfast,-x264opts:0,subme=2:me_range=4:rc_lookahead=10:me=dia:no_chroma_me:8x8dct=0:partitions=none,-force_key_frames:0,expr:gte(t,n_forced*8),-filter_complex,[0:1] aresample=async=1:ochl='stereo':rematrix_maxval=0.000000dB:osr=48000[2],-map,[2],-metadata:s:1,language=eng,-codec:1,libopus,-b:1,135k,-map,1:s:0,-metadata:s:2,language=eng,-codec:2,ass,-strict_ts:2,0,-map,0:t?,-codec:t,copy,-segment_format,matroska,-f,ssegment,-individual_header_trailer,0,-flags,+global_header,-segment_header_filename,header,-segment_time,8,-segment_start_number,0,-segment_copyts,1,-segment_time_delta,0.0625,-segment_list,http://server:32499/video/:/transcode/session/f34792abc0a049d0-com-plexapp-android/889da67e-f296-4229-9e3d-3a898993bc9a/manifest?X-Plex-Http-Pipeline=infinite,-segment_list_type,csv,-segment_list_size,5,-segment_list_separate_stream_times,1,-segment_list_unfinished,1,-segment_format_options,output_ts_offset=10,-max_delay,5000000,-avoid_negative_ts,disabled,-map_metadata:g,-1,-map_metadata:c,-1,-map_chapters,-1,media-%05d.ts,-start_at_zero,-copyts,-vsync,cfr,-y,-nostats,-loglevel,verbose,-loglevel_plex,verbose,-progressurl,http://server:32499/video/:/transcode/session/f34792abc0a049d0-com-plexapp-android/889da67e-f296-4229-9e3d-3a898993bc9a/progress
plex_orchestrator.1.j2cqglzqo0i6@server02    | Args => segment_time: 8, ss: NaN, min_seg_duration: 10, skip_to_segment: NaN, segment_start_number: 0
plex_orchestrator.1.j2cqglzqo0i6@server02    | Queueing job b0d28a73-dfe1-4bce-8b78-d53c0f3be321
plex_orchestrator.1.j2cqglzqo0i6@server02    | Queueing task a23b4633-34ad-4007-b2a0-ef04802adc67
plex_orchestrator.1.j2cqglzqo0i6@server02    | Running task a23b4633-34ad-4007-b2a0-ef04802adc67
plex_orchestrator.1.j2cqglzqo0i6@server02    | Forwarding work request to a106d06b-f6c5-44c5-8e95-6ecd30c4aa7e|plex-worker-server01
plex_orchestrator.1.j2cqglzqo0i6@server02    | Received update for task a23b4633-34ad-4007-b2a0-ef04802adc67, status: received
plex_orchestrator.1.j2cqglzqo0i6@server02    | Received update for task a23b4633-34ad-4007-b2a0-ef04802adc67, status: inprogress
plex_orchestrator.1.j2cqglzqo0i6@server02    | Client connected: cDnq_-Qfd_HFFhiiAAAL
plex_orchestrator.1.j2cqglzqo0i6@server02    | Registered new job poster: 04e21e59-6039-422d-af98-0c388acec035|3d4396666091
plex_orchestrator.1.j2cqglzqo0i6@server02    | Creating multiple tasks for the job
plex_orchestrator.1.j2cqglzqo0i6@server02    | All Args => -codec:0,h264,-codec:1,ac3,-ss,224,-analyzeduration,20000000,-probesize,20000000,-i,/data/path/to/file,-ss,224,-analyzeduration,20000000,-probesize,20000000,-i,/transcode/Transcode/Sessions/plex-transcode-f34792abc0a049d0-com-plexapp-android-e71a4d30-c14b-49b4-8d5e-3e3eee1a39e5/temp-0.srt,-filter_complex,[0:0]scale=w=480:h=240:force_divisible_by=4[0];[0]format=pix_fmts=yuv420p|nv12[1],-map,[1],-metadata:s:0,language=eng,-codec:0,libx264,-crf:0,22,-maxrate:0,541k,-bufsize:0,1082k,-r:0,23.975999999999999,-preset:0,veryfast,-x264opts:0,subme=2:me_range=4:rc_lookahead=10:me=dia:no_chroma_me:8x8dct=0:partitions=none,-force_key_frames:0,expr:gte(t,n_forced*8),-filter_complex,[0:1] aresample=async=1:ochl='stereo':rematrix_maxval=0.000000dB:osr=48000[2],-map,[2],-metadata:s:1,language=eng,-codec:1,libopus,-b:1,135k,-map,1:s:0,-metadata:s:2,language=eng,-codec:2,ass,-strict_ts:2,0,-map,0:t?,-codec:t,copy,-segment_format,matroska,-f,ssegment,-individual_header_trailer,0,-flags,+global_header,-segment_header_filename,header,-segment_time,8,-segment_start_number,28,-segment_copyts,1,-segment_time_delta,0.0625,-segment_list,http://server:32499/video/:/transcode/session/f34792abc0a049d0-com-plexapp-android/e71a4d30-c14b-49b4-8d5e-3e3eee1a39e5/manifest?X-Plex-Http-Pipeline=infinite,-segment_list_type,csv,-segment_list_size,5,-segment_list_separate_stream_times,1,-segment_list_unfinished,1,-segment_format_options,output_ts_offset=10,-max_delay,5000000,-avoid_negative_ts,disabled,-map_metadata:g,-1,-map_metadata:c,-1,-map_chapters,-1,media-%05d.ts,-start_at_zero,-copyts,-y,-nostats,-loglevel,verbose,-loglevel_plex,verbose,-progressurl,http://server:32499/video/:/transcode/session/f34792abc0a049d0-com-plexapp-android/e71a4d30-c14b-49b4-8d5e-3e3eee1a39e5/progress
plex_orchestrator.1.j2cqglzqo0i6@server02    | Args => segment_time: 8, ss: 224, min_seg_duration: 10, skip_to_segment: NaN, segment_start_number: 28
plex_orchestrator.1.j2cqglzqo0i6@server02    | Creating segment 1
plex_orchestrator.1.j2cqglzqo0i6@server02    | Queueing job 496dcab9-60e0-473a-aed3-1d8b26a3fcdb
plex_orchestrator.1.j2cqglzqo0i6@server02    | Queueing task 3e33a397-ab9e-4728-9260-f9df28eba93b
plex_orchestrator.1.j2cqglzqo0i6@server02    | Running task 3e33a397-ab9e-4728-9260-f9df28eba93b
plex_orchestrator.1.j2cqglzqo0i6@server02    | Forwarding work request to 6d099096-c2c4-4456-90da-9ec77316d8ab|plex-worker-server03
plex_orchestrator.1.j2cqglzqo0i6@server02    | Received update for task 3e33a397-ab9e-4728-9260-f9df28eba93b, status: received
plex_orchestrator.1.j2cqglzqo0i6@server02    | Received update for task 3e33a397-ab9e-4728-9260-f9df28eba93b, status: inprogress
plex_orchestrator.1.j2cqglzqo0i6@server02    | Client disconnected: UAacu1rC39oBo0JRAAAJ
plex_orchestrator.1.j2cqglzqo0i6@server02    | Removing job-poster 9f11430a-852d-4710-b6d3-090d7eb64546|3d4396666091 from pool
plex_orchestrator.1.j2cqglzqo0i6@server02    | Killing job b0d28a73-dfe1-4bce-8b78-d53c0f3be321
plex_orchestrator.1.j2cqglzqo0i6@server02    | Telling worker a106d06b-f6c5-44c5-8e95-6ecd30c4aa7e|plex-worker-server01 to kill task a23b4633-34ad-4007-b2a0-ef04802adc67
plex_orchestrator.1.j2cqglzqo0i6@server02    | Job b0d28a73-dfe1-4bce-8b78-d53c0f3be321 killed
plex_orchestrator.1.j2cqglzqo0i6@server02    | Received update for task a23b4633-34ad-4007-b2a0-ef04802adc67, status: done
plex_orchestrator.1.j2cqglzqo0i6@server02    | Discarding task update for a23b4633-34ad-4007-b2a0-ef04802adc67
plex_orchestrator.1.j2cqglzqo0i6@server02    | Received update for task 3e33a397-ab9e-4728-9260-f9df28eba93b, status: done
plex_orchestrator.1.j2cqglzqo0i6@server02    | Task 3e33a397-ab9e-4728-9260-f9df28eba93b complete, result: false
plex_orchestrator.1.j2cqglzqo0i6@server02    | Task 3e33a397-ab9e-4728-9260-f9df28eba93b complete
plex_orchestrator.1.j2cqglzqo0i6@server02    | Job 496dcab9-60e0-473a-aed3-1d8b26a3fcdb complete, tasks: 1, result: false
plex_orchestrator.1.j2cqglzqo0i6@server02    | JobPoster notified
plex_orchestrator.1.j2cqglzqo0i6@server02    | Removing job 496dcab9-60e0-473a-aed3-1d8b26a3fcdb
plex_orchestrator.1.j2cqglzqo0i6@server02    | Job 496dcab9-60e0-473a-aed3-1d8b26a3fcdb complete
plex_orchestrator.1.j2cqglzqo0i6@server02    | Client disconnected: cDnq_-Qfd_HFFhiiAAAL
plex_orchestrator.1.j2cqglzqo0i6@server02    | Removing job-poster 04e21e59-6039-422d-af98-0c388acec035|3d4396666091 from pool

@pabloromeo
Owner

Hi!
Awesome that you picked up that code and gave it a shot.
Haven't looked at that stream splitting part in years lol.
Back then the motivation was that I was trying some crazy ideas, like distributing transcoding across an army of Raspberry Pi machines lol. Now, with things like QuickSync readily available on even low-power Celeron machines, there wasn't that much need to split a single transcode; it was more about scaling horizontally for more simultaneous transcodes, while always handling the streaming task on a single worker at a time.

If I remember correctly, the idea had a bit more merit for processes like Optimizing a movie, which could be done in parallel in segments and then stitched back together.
Not sure I ever got working behavior for live streaming though, since you really need the segments to be ready in order, and they'd all be writing to the same shared manifest, so they'd probably overwrite each other. I'd really need to know waaay more about how ffmpeg works to pull something like that off.
But I'll happily try to help however I can if you want to take a stab at it.

Also, it's been years, so maybe somebody has already created something open source that does part of this, which we could leverage or use as a dependency. Haven't had time to look into it.

One thing I never got to exploring was splitting the audio transcoding from the video, to be able to do them on separate nodes.
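
A rough sketch of that audio/video split idea, assuming both workers can read the source over the shared mount (illustrative only, not something ClusterPlex does today):

```typescript
// Hypothetical sketch: run the video and audio transcodes as two independent
// jobs (conceptually on two different workers), then remux the results.
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

async function splitAudioVideoTranscode(input: string): Promise<void> {
  // "Worker A": video only (-an drops audio).
  const videoJob = run("ffmpeg", [
    "-i", input, "-map", "0:v:0", "-c:v", "libx264", "-an", "-y", "video.mkv",
  ]);
  // "Worker B": audio only (-vn drops video).
  const audioJob = run("ffmpeg", [
    "-i", input, "-map", "0:a:0", "-c:a", "libopus", "-vn", "-y", "audio.mka",
  ]);
  await Promise.all([videoJob, audioJob]);
  // Orchestrator: mux the two results back together without re-encoding.
  await run("ffmpeg", [
    "-i", "video.mkv", "-i", "audio.mka",
    "-map", "0:v:0", "-map", "1:a:0", "-c", "copy",
    "-y", "combined.mkv",
  ]);
}
```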

@FelixClements

Hey!

Thank you so much for your detailed response! Your insights actually pointed me in the direction of hardware transcoding instead of stream splitting, which was immensely helpful.

I went ahead and looked into hardware transcoding solutions and finally got it working, though it was quite challenging to implement within Docker Swarm. I ended up spending quite some time configuring the right setup and managing resource constraints specific to hardware acceleration.

I'm really curious about your setup. How did you manage to get hardware transcoding working in your environment? Any tips or advice you could share would be greatly appreciated!

To get it to work I had to:
1. deploy this stack: https://github.com/allfro/device-mapping-manager/tree/master
2. add these volumes to the server and workers:

      - /mnt/glusterfs/plex_server/Drivers:/config/Library/Application Support/Plex Media Server/Drivers
      - /mnt/glusterfs/plex_server/Cache:/config/Library/Application Support/Plex Media Server/Cache
      - /dev/dri/:/dev/dri/
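
For context, this is roughly where those mounts sit in a Swarm stack file; service and image names below are assumptions, not the project's documented compose file (device-mapping-manager is needed because Swarm stack deploys don't support device mappings directly):

```yaml
  # Illustrative fragment only -- adapt service/image names to your own stack.
  plex-worker:
    image: ghcr.io/pabloromeo/clusterplex_worker:latest  # assumed image name
    volumes:
      - /mnt/glusterfs/plex_server/Drivers:/config/Library/Application Support/Plex Media Server/Drivers
      - /mnt/glusterfs/plex_server/Cache:/config/Library/Application Support/Plex Media Server/Cache
      # Bind-mounting /dev/dri exposes the iGPU; device-mapping-manager grants the device cgroup access.
      - /dev/dri/:/dev/dri/
```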

@pabloromeo
Owner

Yeah, hardware transcoding on Docker Swarm is quite a bit more challenging.
In my case I wasn't doing hardware transcoding when running on Swarm. However, I later migrated to Kubernetes, and hardware transcoding was a bit easier there. Since I'm using QuickSync and not NVIDIA GPUs, it required installing the Intel drivers on the cluster and managing the resource requests so that workers run on nodes with i915 devices, IIRC.
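
For anyone trying the same, the resource-request side looks roughly like this when the Intel GPU device plugin is installed (a sketch with assumed names; the gpu.intel.com/i915 resource is advertised by that plugin):

```yaml
# Illustrative worker Deployment fragment -- names and image are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: clusterplex-worker
spec:
  replicas: 2
  selector:
    matchLabels:
      app: clusterplex-worker
  template:
    metadata:
      labels:
        app: clusterplex-worker
    spec:
      containers:
        - name: worker
          image: ghcr.io/pabloromeo/clusterplex_worker:latest  # assumed image
          resources:
            limits:
              # Only schedules onto nodes where the Intel GPU device plugin exposes an i915 device.
              gpu.intel.com/i915: 1
```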

@FelixClements

Thanks for bringing up the topic of hardware transcoding with Docker Swarm and your experience with Kubernetes.

I actually faced some similar challenges with hardware transcoding as well. In my specific use case, I made some file changes in my fork of the project to prioritise Intel support.

If you're interested, feel free to check out my fork, where I've implemented these updates. You could reference these changes or adapt them for your setup to make the process easier for others wanting to get Intel hardware transcoding up and running in their environment.

Let me know if you need further details or clarification; I'd be happy to help!

Cheers

@Varashi

Varashi commented Oct 12, 2024

(Quoting @FelixClements's comment above on getting hardware transcoding working via device-mapping-manager and the /dev/dri volume mounts.)

Hi Felix.

Can you give a bit more information as to which files to drop in the Drivers and Cache folders that you're mapping?
I'm specifically looking into getting an Intel Arc A310 working with ClusterPlex in Kubernetes, but no luck so far.
