
Run node on host or single container #26

Closed
1 of 4 tasks
endersonmaia opened this issue Apr 11, 2023 · 18 comments
@endersonmaia
Contributor

endersonmaia commented Apr 11, 2023

📚 Context

The off-chain services that compose the cartesi-node solution are currently released as one container image per service. If you deploy each of these services as a separate container (Docker/Kubernetes), everything works just fine.

But when you need to run the node directly on the host or inside a single container, there is no release available for that.

Why is this problem relevant?

Depending on the environment where you need to deploy a cartesi-node, you may have restrictions on running multiple services and containers, and on the communication between them.

Although containers are standard, we still need to support those who don't use containers.

✔️ Solution

We could have a container image release with all the services together.

We could have binary releases without the container, so anyone can deploy this on a "plain old server": a VM, a VPS, or bare metal.

📈 Subtasks

  • environment variables should be normalized so that they don't conflict between services
  • health-checks should be optional
  • logs should be prefixed with the services
  • binary releases at least for linux/amd64

@gligneul gligneul changed the title Enable running of services alongside each other on the host or within a single container Run node in a single container Apr 11, 2023
@gligneul gligneul changed the title Run node in a single container Run node in host or single container Apr 11, 2023
@gligneul gligneul changed the title Run node in host or single container Run node on host or single container Apr 11, 2023
@gligneul
Contributor

@endersonmaia I have a few questions regarding the subtasks.

environment variables should be normalized so that they don't conflict between services

Is it better to add a unique prefix to each service? Or do we just need to make sure that the same variable has the same name across all services?

health-checks should be optional

I don't know if I agree with this. Couldn't you just assign port 0 for the health check so the system assigns a random port to it?
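
For reference, a minimal sketch of that port-0 approach in plain Rust (standard library only; this is illustrative, not the node's actual health-check code):

use std::net::TcpListener;

fn main() -> std::io::Result<()> {
    // Binding to port 0 asks the OS to pick any free port.
    let listener = TcpListener::bind("0.0.0.0:0")?;
    // The chosen port is only known after binding, so anything that must be
    // configured up front (e.g., a Kubernetes probe) cannot rely on it.
    println!("health check listening on {}", listener.local_addr()?);
    Ok(())
}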

logs should be prefixed with the services

Do you have a guideline for the log format?

binary releases at least for linux/amd64;

Is this necessary for the task? Wouldn't it be sufficient to release a Docker image with all the service binaries?

@endersonmaia
Contributor Author

Is it better to add a unique prefix to each service? Or do we just need to make sure that the same variable has the same name across all services?

Good examples are SESSION_ID and REDIS_ENDPOINT; I think they could be the same for every service.

I don't know if I agree with this. Couldn't you just assign port 0 for the health check so the system assigns a random port to it?

I need to know the port to configure this from the outside (a Kubernetes manifest or docker-compose).

But maybe this falls into the same prefix problem: we need an INDEXER_HC_PORT and a DISPATCHER_HC_PORT to avoid conflicts.

Is this necessary for the task? Wouldn't it be sufficient to release a Docker image with all the service binaries?

This could be another issue, but it would be great to be able to download binary releases directly from the GitHub Releases page if you need to run the node yourself, without the container stuff.

@gligneul
Contributor

This could be another issue, but it would be great to be able to download binary releases directly from the GitHub Releases page if you need to run the node yourself, without the container stuff.

The services are not meant to be used on their own, since you need a lot of configuration, so I think this is a very particular use case. We should probably create a separate issue to discuss that.

@endersonmaia
Contributor Author

Another example:

dispatcher

  • TX_PROVIDER_HTTP_ENDPOINT

state-server

  • BH_HTTP_ENDPOINT
  • BH_WS_ENDPOINT

TX_PROVIDER_HTTP_ENDPOINT and BH_HTTP_ENDPOINT can have the same value.
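
A minimal sketch of how a service could read a shared, normalized variable while still accepting its legacy per-service name (the shared name HTTP_ENDPOINT is hypothetical, not the actual rollups configuration):

use std::env;

fn http_endpoint() -> Option<String> {
    // Prefer the shared, normalized variable; fall back to the legacy
    // service-specific one so existing deployments keep working.
    env::var("HTTP_ENDPOINT")
        .or_else(|_| env::var("BH_HTTP_ENDPOINT"))
        .ok()
}

fn main() {
    match http_endpoint() {
        Some(url) => println!("using endpoint {url}"),
        None => eprintln!("no HTTP endpoint configured"),
    }
}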

@omidasadpour
Contributor

omidasadpour commented Apr 20, 2023

@endersonmaia Is it a good idea to have all of the services in just one Docker image?
We will have lots of challenges, like:

1 - How should they work together?

2 - What if one of the services crashes? Should we restart the whole container, or can we just restart that specific service? Maybe we need to use a process management tool like Supervisord.

3 - What if one of our services forks into multiple processes? (For example, the Apache web server starts multiple worker processes.)

4 - What about our service dependencies, like Postgres and Redis? Should we deploy them inside the same container, or in different containers? (If we deploy them in different containers, then the user has to manage the networking stuff in Docker again.)

@endersonmaia
Contributor Author

@endersonmaia Is it a good idea to have all of the services in just one Docker image? We will have lots of challenges, like:

There's nothing bad about it. :)

1 - How should they work together?

They should work the same.

2 - What if one of the services crashes? Should we restart the whole container, or can we just restart that specific service? Maybe we need to use a process management tool like Supervisord.

Each service should be resilient enough not to depend on external supervision/orchestration. It's each service's responsibility to retry failing connections with some retry/backoff/timeout logic until it finally fails and exits, with good logs explaining the reason for the failure. The supervisor/orchestrator should also have its own retry/backoff/timeout configuration.
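
A minimal sketch of that retry/backoff behavior (standard library only; the attempt count and delays are arbitrary):

use std::thread;
use std::time::Duration;

// Placeholder for a real connection attempt (Redis, Postgres, ...).
fn connect() -> Result<(), String> {
    Err("connection refused".into())
}

fn main() {
    let mut delay = Duration::from_millis(500);
    for attempt in 1..=5 {
        match connect() {
            Ok(()) => return,
            Err(err) => {
                // Log the reason so the operator can diagnose the failure.
                eprintln!("attempt {attempt} failed: {err}; retrying in {delay:?}");
                thread::sleep(delay);
                delay *= 2; // exponential backoff
            }
        }
    }
    // Exit non-zero so the supervisor (s6, systemd, Kubernetes, ...) can
    // apply its own restart policy.
    std::process::exit(1);
}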

3 - What if one of our services forks into multiple processes? (For example, the Apache web server starts multiple worker processes.)

Our services already do that; there's no issue here.

4 - What about our service dependencies, like Postgres and Redis? Should we deploy them inside the same container, or in different containers? (If we deploy them in different containers, then the user has to manage the networking stuff in Docker again.)

These dependencies can be managed by the supervisor/scheduler used.

I'm experimenting with s6-overlay for the single-container approach; someone could try systemd if they need to and solve this there.


We're not going to stop releasing container images for each service like we do now; we're only going to offer other options, a single container being one of them.

@gligneul gligneul self-assigned this Apr 20, 2023
@endersonmaia
Contributor Author

endersonmaia commented Apr 27, 2023

@gligneul it just occurred to me that all services will see the RUST_LOG configuration, but what if I want to control the log level for just a single service via environment variables?

One suggestion would be to have RUST_LOG by default, which would be overridden by $<service>_RUST_LOG or $<service>_LOG_LEVEL.

So, if I want to define the log level globally, I could use RUST_LOG or LOG_LEVEL or CARTESI_NODE_LOG_LEVEL.

In case I want to configure a specific service, I could use DISPATCHER_LOG_LEVEL or CARTESI_NODE_DISPATCHER_LOG_LEVEL.

This comment could be transformed into an issue if you want.

@gligneul
Contributor

@endersonmaia RUST_LOG is a standard variable from the Rust logging ecosystem; I'm not sure if we can change it. And even if we can, I'm not sure we should.

You can already set the log level for specific services by specifying the given Rust module. For instance: RUST_LOG="dispatcher=trace,advance_runner=trace", and so on.
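
For context, RUST_LOG is the filter variable read by common Rust logging crates; a minimal sketch assuming the env_logger and log crates (not necessarily the node's actual logging setup):

use log::{info, trace};

fn main() {
    // env_logger parses RUST_LOG, e.g. "dispatcher=trace,advance_runner=trace",
    // where each comma-separated segment is a module path and a level.
    env_logger::init();

    info!("visible at the default level");
    trace!("visible only if this module's level is set to trace");
}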

@endersonmaia
Contributor Author

Yeah, that's why I suggest exposing LOG_LEVEL instead of RUST_LOG, and dealing with this internally.

Nice that I can target a service in RUST_LOG, I didn't know that. Even so, imagine that I want to define different log levels for different services; putting all of this in a single RUST_LOG doesn't feel great.

@gligneul
Contributor

I agree that info should be the default. That should be easy to configure.

imagine that I want to define different log levels for different services, putting all this in a single RUST_LOG don't feel great.

That is not much different from specifying multiple variables. You can still set the default level in RUST_LOG: RUST_LOG="...,info".

@endersonmaia
Contributor Author

That is not much different from specifying multiple variables.

I disagree.

I prefer reading the explicit:

DISPATCHER_LOG_LEVEL="trace"
ADVANCE_RUNNER_LOG_LEVEL="trace"
INDEXER_LOG_LEVEL="info"
GRAPHQL_SERVER_LOG_LEVEL="warn"

to this:

RUST_LOG="dispatcher=trace,advance_runner=trace,indexer=info,graphql-server=warn"

Maybe it's a matter of taste, IDK.

@tuler
Member

tuler commented Apr 27, 2023

It depends on who the user is.
If it's a hardcore user, an infrastructure manager, a cloud provider, etc., I think it's OK to be RUST_LOG.

For the application developer it will be something very simple like

sunodo run
sunodo run --verbose

and we will decide what to do.

@gligneul
Contributor

Maybe it's a matter of taste, IDK.

Yes, it looks better, but we would have to implement this logic by hand. RUST_LOG already works out of the box and provides the functionality we need, even though it looks a bit ugly.
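
A minimal sketch of what that hand-rolled logic could look like, translating per-service variables into a single RUST_LOG-style filter (the variable and module names are illustrative):

use std::env;

fn build_log_filter() -> String {
    // Hypothetical per-service variables mapped to the Rust modules they control.
    let services = [
        ("DISPATCHER_LOG_LEVEL", "dispatcher"),
        ("ADVANCE_RUNNER_LOG_LEVEL", "advance_runner"),
        ("INDEXER_LOG_LEVEL", "indexer"),
        ("GRAPHQL_SERVER_LOG_LEVEL", "graphql_server"),
    ];
    let mut parts: Vec<String> = services
        .iter()
        .filter_map(|(var, module)| {
            env::var(var).ok().map(|level| format!("{module}={level}"))
        })
        .collect();
    // A hypothetical global LOG_LEVEL becomes the catch-all default level.
    if let Ok(default) = env::var("LOG_LEVEL") {
        parts.push(default);
    }
    parts.join(",")
}

fn main() {
    // e.g. DISPATCHER_LOG_LEVEL=trace LOG_LEVEL=info -> "dispatcher=trace,info"
    println!("RUST_LOG={}", build_log_filter());
}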

@gligneul
Contributor

gligneul commented Jul 4, 2023

We merged the health check improvement to allow the configuration of multiple services. @endersonmaia, is there anything else that you need to be prioritized on our side?

@gligneul gligneul added this to the 1.1.0 milestone Jul 4, 2023
@gligneul gligneul removed their assignment Jul 4, 2023
@endersonmaia
Contributor Author

@gligneul nothing that I can think of right now.

I'll test these new health-check options at sunodo/rollups-node.

@omidasadpour
Contributor

We merged the health check improvement to allow the configuration of multiple services. @endersonmaia, is there anything else that you need to be prioritized on our side?

@gligneul We could have a graceful shutdown (preStop hook) config too.

This can help us manage the service lifecycle better than we do now.
/cc @endersonmaia
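
A minimal sketch of graceful-shutdown handling, assuming a tokio-based service (Kubernetes runs the preStop hook and then sends SIGTERM; the rest is illustrative):

use tokio::signal::unix::{signal, SignalKind};

#[tokio::main]
async fn main() -> std::io::Result<()> {
    // A well-behaved service finishes in-flight work when SIGTERM arrives
    // instead of being killed mid-request.
    let mut sigterm = signal(SignalKind::terminate())?;
    println!("service running; waiting for SIGTERM");
    sigterm.recv().await;
    // Flush buffers, close connections, etc., then exit cleanly.
    println!("SIGTERM received, shutting down gracefully");
    Ok(())
}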

@gligneul
Contributor

gligneul commented Jul 4, 2023

@gligneul nothing that I can think of right now.
I'll test these new health-check options at sunodo/rollups-node.

Ok, thanks!

@gligneul We could have a graceful shutdown (preStop hook) config too.

Graceful shutdown is important. I will create an issue for it.

@gligneul gligneul modified the milestone: 1.1.0 Jul 11, 2023
@gligneul gligneul self-assigned this Jul 14, 2023
@gligneul gligneul transferred this issue from cartesi/rollups Jul 27, 2023
@gligneul gligneul removed their assignment Aug 21, 2023
@torives torives mentioned this issue Sep 8, 2023
@torives
Contributor

torives commented Sep 8, 2023

This issue has spawned a lot of interesting discussions and issues, but it has become quite confusing. I'll be closing it now, but I've created #80 to tackle its original proposal.

@torives torives closed this as completed Sep 8, 2023
@github-project-automation github-project-automation bot moved this from 📋 Backlog to ✅ Done in Node Unit Sep 8, 2023