Explore decoupling the eden server's two main subsystems: API and worker #4

Open
one1zero1one opened this issue Sep 23, 2021 · 0 comments
one1zero1one commented Sep 23, 2021

The problem we have now on the eden server is that the functions that add to the queue and those that consume from it are called from within the same process, so it scales (by design) only within the thread count and GPU count of a single machine.

Scaling it further means duplicating all components n times. While this is in theory possible (as most things are in the cloud), dealing with n individual queues on m physical servers for async jobs becomes a problem in itself, one that should not exist in the first place (ironically, one could solve this problem with ... a queue). The irony is not lost on us.

I'll use this issue to explore what it would take to change for scale, while keeping the abstractions and concepts you guys defined with eden intact.

Each unit within eden is called a BaseBlock, they're the units which take certain inputs and generate art accordingly.

Fine, so this is sort of a code template for the worker subsystem. For all intents and purposes run() takes text, numbers, images, and a GPU as input, does the thing, and can output text, numbers, and images. This gets neatly defined in code.

(A side note: currently BaseBlock is tightly coupled with hosting a block. This is what actually made it very hard for me to wrap my head around the first time I read it, so let me vent my inner artist frustration a little bit: if the hosting fails in unknown ways, the block in itself is useless. Running a block without hosting (connecting the client directly to a block, for example) would give people who are obviously going to break their necks trying to host a way to run their eden-compatible blocks locally. It would also give some love to eden.datatypes on a shorter feedback loop than hosting.)

Back to our 🐑: what do I mean by decoupling the hosting code? It's as trivial as running the same code twice in different configurations. For example:

The following app.py Python file implements both the frontend web server and the worker; they just need to be started with the flask and celery CLIs respectively.

(third paragraph from here)

Decoupling would in effect just mean that I could run the same server code in two configurations: one that logically only runs the API plus the celery functions that put stuff into redis, and one that only runs the celery functions that take from redis and run the blocks.

So, in effect, running host_block as-is would do what it does now: load up the block definition, spin up the API, spin up celery that would use redis and local disk, and do the whole dance.

Running host_block with a serve parameter would limit it to serving and adding to the queue. Running it with a work parameter would just look in the queue and run the next block. The beauty of it is that the queue allows us to run more instances of host_block in work mode.

If you run it coupled, you can control by threads how many you run in parallel.

If you run it decoupled, it's the orchestrator's job to decide (or we can write a small dispatcher if we want to run this on something that's not Kubernetes, Nomad, Docker Swarm, etc.).
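To make the three modes concrete, here is a minimal sketch of the idea, not eden's actual code. Everything in it is an assumption for illustration: the `mode`, `pending`, and `threads` parameters, the `submit` and `work` helpers, and the use of an in-process `queue.Queue` and a dict standing in for redis and the results store. The real host_block has a different signature, and the real queue would be celery on top of redis.

```python
import queue
import threading
import uuid


def submit(jobs, config):
    """'serve' side: enqueue a job and hand back a token, never run the block."""
    token = uuid.uuid4().hex
    jobs.put((token, config))
    return token


def work(jobs, results, run_block):
    """'work' side: pull jobs until the queue is empty and run the block on each."""
    while True:
        try:
            token, config = jobs.get_nowait()
        except queue.Empty:
            return
        results[token] = run_block(config)


def host_block(run_block, jobs, results, mode="all", pending=(), threads=1):
    """Hypothetical mode switch (assumed names, not eden's real signature).

    mode="all"   -> coupled: enqueue and consume in the same process
    mode="serve" -> only enqueue the pending requests
    mode="work"  -> only consume whatever is already in the queue
    """
    tokens = []
    if mode in ("all", "serve"):
        tokens = [submit(jobs, config) for config in pending]
    if mode in ("all", "work"):
        workers = [
            threading.Thread(target=work, args=(jobs, results, run_block))
            for _ in range(threads)
        ]
        for worker in workers:
            worker.start()
        for worker in workers:
            worker.join()
    return tokens
```

With a shared queue, one process can call `host_block(..., mode="serve")` while any number of others call `host_block(..., mode="work")`. In coupled mode, `threads` is the only knob, which is the limitation described above.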

Hope that makes sense, I look forward to hearing the feedback.

One more idea, which still bugs me. The dependence of a block on hosting is really bad for business. And by business I mean being an artist who doesn't want to learn networking.

Maybe a silly idea, but hosting a block without a network could simply mean dumping the python block to a file. And running the client without a network could just run the file from disk. All nicely abstracted, so that I can play with eden-clip on my local GPU while still using the eden concepts.
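As a rough illustration of that file-based idea, the sketch below writes a block's source to a `.py` file and later loads and runs it from disk with the standard library's `importlib`. The names `host_to_file`, `run_from_file`, the `run(config)` entry point, and the toy block source are all hypothetical and are not part of eden's API.

```python
import importlib.util
from pathlib import Path

# Toy stand-in for a block; a real eden block would do actual work on a GPU.
BLOCK_SOURCE = '''
def run(config):
    return {"text": config["prompt"][::-1]}
'''


def host_to_file(source, path):
    """'Host' a block with no network: just write its source to disk."""
    path = Path(path)
    path.write_text(source)
    return path


def run_from_file(path, config):
    """'Client' with no network: import the block file and call its run()."""
    spec = importlib.util.spec_from_file_location("local_block", str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.run(config)
```

For example, `run_from_file(host_to_file(BLOCK_SOURCE, "block.py"), {"prompt": "hello"})` would call the block's `run()` without any server or queue in between.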

I would even go one step further and say you would have 3 hosting levels: first, hosting locally as a file; second, hosting locally with the current all-in-one queue system; and third, abraham cloud :), which would somehow ship your block to some kind of sandbox in our Kubernetes and try to build and run it on the fly. But that's for another time.

For now, I'm just curious what you guys think about the splitting of this particular atom :)
