Cayars - Setup walk through and some tips and tricks

@cayars
 
How does your conversion script handle subs and audio that are tagged as unknown instead of English?


It will default to English (configurable) for them. So basically, if it doesn't know for sure that it's a language you don't want, it will keep it.

Hope that helps,
Carlo

Fair enough. I also don't have nearly the users you do and my system has more downtime so I'm looking at it from an energy usage and cost savings perspective even when the system is on. I haven't yet built a dedicated Plex box, my year of Plex has been upgrading my in-home network and building a NAS. If you actually build a viable way of doing distributed transcoding it'll likely influence my Plex server build. Looking forward to playing with what you come up with.
 
I do think I could move my Plex install to my unRaid box and then have it wake my desktop up if/when you add WOL functionality. That alone would be hugely beneficial. So there's probably a whole pool of use cases for remote transcoding besides the whole distributed part.


For sure. However, keep in mind that the initial release will be Windows only. I started this in C# because Emby was in C# and because it's easy to work on in Visual Studio. However, it should work in Mono (just like Emby), or I could rewrite it in a different language for other platforms once the logic in the C# version is solid. I'll cross that bridge when we get to it.

Carlo

I'm flexible. Latest versions of unRaid have a VM manager built in so I'll throw some form of Windows in one of those to start. That's my current plan once you have something to share. Really looking forward to this.

Also looking forward to this... currently running two servers because of very infrequent load issues... having distributed transcoding would really help with that. Thanks cayars

All of this talk about distributed transcoding is interesting but I am a little (OK, more than a little) confused. Is there a forum post or support page that talks more about this? I have my whole setup on a very beefy system, but there are other computers on my home LAN. Could I add them to a "CPU pool" and call on them when I need the assistance? All of the data is on my main PC so probably not; just trying to understand how others are doing this on a more rudimentary level.


It is only Cayars' prototype... no one else has this, nor is it discussed anywhere else.

I know that it is a Carlo-only type of thing, but folks are talking about how they would use it, so I am just wondering what the system requirements would be. Obviously more than one computer, but do the media folders have to be on multiple machines, or is the library DB on one while the media is on another, etc.?


Well, from his explanation, all this does is take the ffmpeg work and offload it to another ffmpeg running on another machine... so no touching of your media folders or library. I'm guessing the transcode file will just reside on the other machine and Plex will think it is in the transcode folder. Or it could output the file back to the transcode folder, but this would create a lot of overhead.

I don't think it would necessarily add that much overhead. If you have a decent wired network setup you're just writing across to another disk vs. a local disk. Once the transcoder gets a bit of a buffer you'd likely be fine.

Carlo could you give a more technical explanation of the workings of your prototype/where you see this going? I'd be interested and I'm sure others would be as well. 

Correct me if I'm wrong...

Currently transcoding takes place in chunks.  The transcoder engine is sent a chunk, which it converts and drops into the specified folder; Plex then prepares to stream it and the transcoder engine is sent another chunk.  These chunks are sent sequentially, from the start of the movie to the end of it, in order.  This means they are processed in order.

If the above statements are correct, Carlo has a few things to overcome that may put a monkey wrench in things...  If one of the transcoding devices is faster than the others, and it is sent a job which it completes while the others are still working on their first job(s), it could get sent its second chunk, which may even be done before the first chunks from the other devices.  Now this chunk is out of order.  How would Plex deal with this "out of sequence" chunk?

This is as much about timing as it is about the speed of the CPUs involved.  It's also about how Plex handles the chunks.

I think what happens is 1 transcode job gets 1 machine. Since each transcode job has a unique ID the proxy could determine which box it needs to go to. Obviously Carlo will need to confirm, but that's my understanding of things.

If that's the case, then a single movie isn't broken up, so one device is going to do all the transcoding of that movie.  Timing is still somewhat of an issue, as you need the transcoding to start fast enough that the client doesn't time out.

With 3-4 devices transcoding a movie at a time, getting the whole thing done is going to be fast!  This is what I was thinking the whole distributed transcoding was going to be about as well as taking the load off the PMS CPU.

That would actually be pretty sweet. I bet once you get past a few chunks you wouldn't really have issues with them being out of order. It'll be interesting to see how this shakes out.

Either way, I can see this being of huge benefit to someone with a NAS with 2 NICs...  One is active on the WAN side of the network, and is able to send/receive Plex requests on that line.  The other NIC could be on a "Transcode" network.  Small switch (5 or 8 port) with a couple of NUCs on this switch.  Transcode requests go to these devices through that switch and not through the main WAN available network.

With gigabit switches and NICs on all the devices on the transcode side, moving data back and forth is tout de suite FAST! especially if they are the only devices on that network.  Could put the whole thing in a self-contained enclosure, switch and NUCs, with a small UPS for power and a fan or two in it.  Run a power cable and Cat 5e/6 to the NAS and the transcode box is GTG, completely self-contained...

At that point it's just configuration and time to watch some movies....  :)

So Carlo, would the best approach for figuring out available transcodes per distributed server be to put the PassMark score for each listed device somewhere like a list/database? Then let the main agent figure out what resolution/quality level each transcode stream it is handed needs. Like a 720p stream is 1500 PassMark vs. 1080p being 2000 (or whatever testing shows), and it keeps a running tab of used vs. available PassMark for active transcoding clients. Once it needs more PassMark than is available, it sends a WOL packet to the next resource to bring it online. After needs drop back down, a sleep command is sent?
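
Roughly the bookkeeping I'm imagining, as a back-of-the-napkin sketch (the names, costs, and scores are all made up for illustration):

```csharp
// Sketch of PassMark-style capacity bookkeeping: each client gets a benchmark
// score, each stream quality gets an estimated "cost", and a new transcode only
// lands on a client with enough headroom. Numbers/names are illustrative only.
using System.Collections.Generic;
using System.Linq;

class TranscodeClient
{
    public string Host;
    public int PassMark;              // benchmark score the admin entered
    public int InUse;                 // summed cost of active transcodes
    public int Free => PassMark - InUse;
}

static class CapacityPlanner
{
    // Guessed costs per stream quality; real values would come from testing.
    static readonly Dictionary<string, int> Cost = new Dictionary<string, int>
    {
        ["720p"]  = 1500,
        ["1080p"] = 2000,
    };

    public static TranscodeClient Assign(List<TranscodeClient> clients, string quality)
    {
        int needed = Cost[quality];
        var target = clients.FirstOrDefault(c => c.Free >= needed);

        // If nothing has headroom, the caller would WOL the next sleeping box,
        // wait for it to come up, add it to the list, and retry; once a box has
        // been idle for a while a sleep command could be sent back to it.
        if (target != null) target.InUse += needed;
        return target;
    }
}
```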

Correct me if I'm wrong...
 
Currently transcoding takes place in chunks.  The transcoder engine is sent a chunk, which it converts and drops into the specified folder; Plex then prepares to stream it and the transcoder engine is sent another chunk.  These chunks are sent sequentially, from the start of the movie to the end of it, in order.  This means they are processed in order.
 
If the above statements are correct, Carlo has a few things to overcome that may put a monkey wrench in things...  If one of the transcoding devices is faster than the others, and it is sent a job which it completes while the others are still working on their first job(s), it could get sent its second chunk, which may even be done before the first chunks from the other devices.  Now this chunk is out of order.  How would Plex deal with this "out of sequence" chunk?
 
This is as much about timing as it is about the speed of the CPUs involved.  It's also about how Plex handles the chunks.

Plex does this a bit differently than Emby, and this is going to be where I may have to rework things a bit.
 

I think what happens is 1 transcode job gets 1 machine. Since each transcode job has a unique ID the proxy could determine which box it needs to go to. Obviously Carlo will need to confirm, but that's my understanding of things.

I could possibly have to track transcodes and complete each one entirely on the same machine. Until I actually get back into working on this (after a work project) I won't know yet.
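
If it does come to that, the bookkeeping should be small. Something like this sketch, where the unique session ID already present in the transcoder's command line picks the host (purely illustrative, not the prototype's code):

```csharp
// Pin each transcode session to one machine: the first chunk of a session
// chooses a host, later chunks for the same session ID reuse it.
using System;
using System.Collections.Concurrent;

static class SessionRouter
{
    static readonly ConcurrentDictionary<string, string> SessionToHost =
        new ConcurrentDictionary<string, string>();

    public static string HostFor(string sessionId, Func<string> pickNewHost)
    {
        // GetOrAdd only calls pickNewHost the first time a session is seen.
        return SessionToHost.GetOrAdd(sessionId, _ => pickNewHost());
    }

    public static void Finished(string sessionId)
    {
        SessionToHost.TryRemove(sessionId, out _);
    }
}
```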
 

So Carlo, would the best approach for figuring out available transcodes per distributed server be to put the PassMark score for each listed device somewhere like a list/database? Then let the main agent figure out what resolution/quality level each transcode stream it is handed needs. Like a 720p stream is 1500 PassMark vs. 1080p being 2000 (or whatever testing shows), and it keeps a running tab of used vs. available PassMark for active transcoding clients. Once it needs more PassMark than is available, it sends a WOL packet to the next resource to bring it online. After needs drop back down, a sleep command is sent?


This is something I've thought about, but to be honest I'm not sure it's going to be needed. For example, if it works in "chunks" and these chunks can get spread around "round-robin" style, then this will pretty much be moot. In that case it might be as simple as just using the computer with the least CPU use at present. If, on the other hand, a particular transcode needs to stay pinned to a particular computer for whatever reason, then this will be much more important.
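
As a rough sketch of that least-CPU idea (the Client type and how its CPU use gets sampled are placeholders, not what the prototype actually does):

```csharp
// Hand each new transcode (or chunk) to whichever online client currently
// reports the lowest CPU use.
using System.Collections.Generic;
using System.Linq;

class Client
{
    public string Host;
    public bool IsOnline;

    public double GetCpuUsePercent()
    {
        // A real build would poll the remote box (WMI, a small agent, etc.);
        // hard-coded here just to keep the sketch self-contained.
        return 0.0;
    }
}

static class Scheduler
{
    public static Client PickLeastBusy(IEnumerable<Client> clients)
    {
        return clients.Where(c => c.IsOnline)
                      .OrderBy(c => c.GetCpuUsePercent())
                      .FirstOrDefault();
    }
}
```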

These are all GOOD questions which I've already thought about and know I need to test. These things don't really worry me much at all, as I'll just work through them. What I'll find more challenging is taking into consideration how each person may want to use this. Examples (see the rough sketch after this list):

Admin 1: Wants to run all transcodes on the Plex server until it hits, say, 80% CPU, then use client #2 until it hits 90% CPU use, then use client #3.
Admin 2: Wants to keep the Plex server's CPU as low as possible and do all transcoding on client #2 first, falling back to client #1 (the Plex server).
Admin 3: Wants to run transcodes on client #1 (the Plex server) until it breaks 50% use, then WOL client #2 and use it until 80%, then WOL client #3 and use it.
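
Just to illustrate, each admin's preference could boil down to an ordered rule list, something like this (field names and numbers are made up, not an actual config format):

```csharp
// "Use this client until its CPU passes this threshold, waking it first if
// needed, otherwise move on to the next rule."
class TranscodeRule
{
    public string Client;          // "PlexServer", "Client2", ...
    public double MaxCpuPercent;   // hand new work to the next rule past this
    public bool WakeOnLanFirst;    // send a WOL packet before first use
}

class ExamplePolicies
{
    // Admin 3's scenario from the list above, roughly:
    static readonly TranscodeRule[] Admin3 =
    {
        new TranscodeRule { Client = "PlexServer", MaxCpuPercent = 50 },
        new TranscodeRule { Client = "Client2",    MaxCpuPercent = 80,  WakeOnLanFirst = true },
        new TranscodeRule { Client = "Client3",    MaxCpuPercent = 100, WakeOnLanFirst = true },
    };
}
```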

What happens if WOL doesn't work or a client isn't available? How do I reallocate transcode sessions?

So to me, getting the "basics" working and figuring out whether "chunking" will allow different parts to run on different computers will be the easy part; figuring out WHICH client to use will be the harder part, since everyone will want a different configuration.

Make sense?

I like it. You've obviously spent a good deal of time thinking this through. I can't wait to see some actual prototype builds we can try.

Also

Admin 4: Wants to run Plex on a low-power device and WOL the actual transcoding server when someone wants to play something.

EDIT: Maybe you should start a separate thread for discussing this project too.

I may when I start to release stuff. For now it's just chit chat. :)

Distributed Encoding is the correct answer.

Like I said, I have this working in prototype mode for Emby, which uses ffmpeg directly. Plex uses a modified version of ffmpeg with a different name, but it should work roughly the same.
In a nutshell it will work similarly to Emby, and this is the "setup" on that platform at present: you would just rename ffmpeg.exe to ffmpegOriginal.exe (or similar) and then drop in a replacement ffmpeg.exe I've created. The drop-in replacement is basically just a "proxy" that takes the command-line options sent to it and hands them off to ffmpeg.exe running on another computer (or the server). It watches stderr, stdin, and stdout and shuttles these messages between the actual ffmpeg and the drop-in proxy. Plex should not even be aware it's not "talking" to the real transcoder.
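
To make that concrete, here's a very rough sketch of what such a drop-in proxy could look like. The agent on the remote box, the host/port, and the line protocol are assumptions made up for illustration, not the actual code:

```csharp
// ffmpeg.exe drop-in proxy (sketch). Assumes an agent on the remote machine
// listens on AgentPort, runs the real transcoder with the arguments it
// receives, and streams that process's output back on the same connection.
// Argument quoting and stdin relay are omitted to keep the sketch short.
using System;
using System.IO;
using System.Net.Sockets;

class FfmpegProxy
{
    const string AgentHost = "192.168.1.50";   // transcode box (placeholder)
    const int AgentPort = 9200;                // agent port (placeholder)

    static int Main(string[] args)
    {
        using (var client = new TcpClient(AgentHost, AgentPort))
        using (var stream = client.GetStream())
        using (var writer = new StreamWriter(stream) { AutoFlush = true })
        using (var reader = new StreamReader(stream))
        {
            // Hand the original command line to the remote agent.
            writer.WriteLine(string.Join(" ", args));

            // Relay whatever the remote transcoder prints so Plex sees the
            // usual progress output and never knows the difference.
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.StartsWith("EXIT "))
                    return int.Parse(line.Substring(5)); // remote exit code
                Console.Error.WriteLine(line);
            }
        }
        return 1; // connection dropped before an exit code arrived
    }
}
```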

As long as the 2nd and 3rd computers have the same drive mappings as the server to the transcode directory and the media itself, everything seems to work. You'll of course also have to copy the Plex transcoder over to these additional machines from the server.

All of the above presently works (for Emby). It should just need some name changes to use the Plex Transcoder EXE and then roughly work (might need a tweak here or there). What I need to do to go from prototype to a real working edition is figure out how I want to monitor the machines and decide which computer to run each transcode on. This could be as simple as monitoring the CPU use on each computer and using the lowest.

Where it will get interesting is being able to wake up computers on the LAN via WOL to dynamically bring additional transcoders online when needed.
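
The wake-up itself is just standard Wake-on-LAN, i.e. broadcasting a "magic packet" (6 bytes of 0xFF followed by the target MAC repeated 16 times). A minimal C# sketch, with a placeholder MAC:

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.Sockets;

static class WakeOnLan
{
    public static void Wake(string macAddress)
    {
        // Parse "00:11:22:33:44:55" (or dash-separated) into 6 bytes.
        byte[] mac = macAddress.Split(':', '-')
                               .Select(part => Convert.ToByte(part, 16))
                               .ToArray();

        // Magic packet: 6 x 0xFF, then the MAC repeated 16 times (102 bytes).
        byte[] packet = new byte[6 + 16 * 6];
        for (int i = 0; i < 6; i++) packet[i] = 0xFF;
        for (int i = 0; i < 16; i++)
            Buffer.BlockCopy(mac, 0, packet, 6 + i * 6, 6);

        using (var udp = new UdpClient())
        {
            udp.EnableBroadcast = true;
            udp.Send(packet, packet.Length, new IPEndPoint(IPAddress.Broadcast, 9));
        }
    }
}

// Usage (placeholder MAC): WakeOnLan.Wake("00:11:22:33:44:55");
```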

For Plex this should work for real-time streaming, cloud sync, device sync, BIF/index file creation, or any other time the transcoder is called.

I like this approach to scaling out more than trying to run multiple Plex servers, as there is only one DB, one set of metadata, and no need to "sync" data among the servers.

What do you guys think?

Carlo

PS: at present it's only prototyped for Windows machines, but I don't see why I couldn't do this for Linux and other platforms as well. It could be useful for those running Plex on a NAS, as they could wake a computer on the LAN to do the transcoding.

If you have any plans to open source this, I'd be happy to lend my .NET development skills in whatever free time I could spare.

Throwing out some lateral thinking. 

Would there be any benefit to having cayars' transcoders run in a Docker container? That way you could use a lot of resources that already exist for Docker, with automatically spinning containers up and stopping them whenever needed.

Also, cayars, random question: have you thought of or played around with HTTP/2 at all? I was wondering if it would be possible to put Plex behind an HTTP/2 reverse proxy and potentially get some of the optimizations it provides.