Hosting production #84

Closed
wlandau opened this issue Sep 10, 2024 · 15 comments

@wlandau
Member

wlandau commented Sep 10, 2024

The p3m maintainers said p3m mirrors and snapshots an existing CRAN-like repository, which means we need to figure out how to host the current snapshot ourselves.

@jeroen, would it be feasible to host a third production universe for this, maybe one with sources only and/or no checks?

If not, @shikokuchuo, would it be okay to consider Netlify as a temporary solution to get things going? I got the sense that Posit might be open to more than just a mirror when things really take off, but they need something else in the initial phase just to get started.

@shikokuchuo
Member

If, as per the discussion, p3m only needs access to a CRAN-like repo for source files because they prefer to use their own build tools, then we can easily deploy using 'types=src' into a GitHub repo. I don't foresee any of the individual source files exceeding GitHub's 100MB file-size limit.

I think this is the safest way to get started, unless @jeroen does actually want p3m to use the R-universe binaries, in which case a production universe seems to be the way forward.

Either way would be fine with me.

With a third party site, if they give us notice they're going to bill us or take us offline, I do not really have a good strategy for that eventuality. It has happened before to open source projects.

@jeroen
Member

jeroen commented Sep 10, 2024

I think we can cross that bridge when we get there, but wouldn't it make sense for p3m to mirror from the multiverse staging repository, at the same moment we create the snapshot ourselves?

IIUC the purpose of staging is to organize a stable repository, ready to be archived or mirrored onto a production site at the given date? So the p3m snapshot of staging at a given date should be identical to our own production snapshots? Otherwise we effectively have 4 stages: Community -> staging -> production -> p3m archives of production repo?
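One way to test whether a p3m snapshot of staging really is identical to our own production snapshot would be to compare the two package indexes. This is a minimal sketch, assuming each index is a character matrix with `Package` and `Version` columns (the shape returned by `available.packages()`). The helper name `index_diff` and the sample data are hypothetical.

```r
# Compare two package indexes (matrices with "Package" and "Version" columns,
# the shape returned by available.packages()). Hypothetical helper.
index_diff <- function(a, b) {
  key <- function(x) paste(x[, "Package"], x[, "Version"], sep = "_")
  list(
    only_in_a = setdiff(key(a), key(b)),
    only_in_b = setdiff(key(b), key(a))
  )
}

# Illustrative data standing in for two real indexes.
staging <- cbind(Package = c("pkgA", "pkgB"), Version = c("1.0.0", "0.2.1"))
snapshot <- cbind(Package = c("pkgA", "pkgB"), Version = c("1.0.0", "0.2.0"))

d <- index_diff(staging, snapshot)

# TRUE only when both indexes list exactly the same package versions.
identical_snapshots <- length(d$only_in_a) == 0 && length(d$only_in_b) == 0
```

With real repositories, the two matrices would come from calling `available.packages(repos = ...)` once per URL.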

We can host snapshots on a self-hosted HTTP server if you don't want to use a public service like Netlify or GitHub Pages. I don't think it makes sense to create a third universe only for hosting files, because this means we also restart all the checks and builds, and at that point it is no longer a snapshot but a new repository...
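To illustrate why plain static hosting is enough, here is a rough sketch of a source-only CRAN-like layout built with `tools::write_PACKAGES()`. It is not the actual R-multiverse workflow: the `demo` package, the directory names, and the use of `tempdir()` are all assumptions for illustration.

```r
# A CRAN-like source repository is a directory tree with src/contrib/
# holding the tarballs plus a PACKAGES index. All names here are illustrative.
repo <- file.path(tempdir(), "static-repo")
contrib <- file.path(repo, "src", "contrib")
dir.create(contrib, recursive = TRUE, showWarnings = FALSE)

# Build a throwaway source package called "demo" (hypothetical).
pkg_parent <- file.path(tempdir(), "pkg-src")
dir.create(file.path(pkg_parent, "demo"), recursive = TRUE, showWarnings = FALSE)
writeLines(
  c("Package: demo", "Version: 0.1.0", "Title: Demo",
    "Description: Placeholder package.", "License: MIT"),
  file.path(pkg_parent, "demo", "DESCRIPTION")
)

# Archive it as demo_0.1.0.tar.gz inside src/contrib.
old_wd <- setwd(pkg_parent)
utils::tar(file.path(contrib, "demo_0.1.0.tar.gz"), files = "demo",
           compression = "gzip", tar = "internal")
setwd(old_wd)

# Write the PACKAGES index files that install.packages() reads.
n <- tools::write_PACKAGES(contrib, type = "source")
index <- read.dcf(file.path(contrib, "PACKAGES"))
```

Any web server that serves the `repo` directory as-is can then act as the repository, which is why no additional universe is needed just for hosting.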

@shikokuchuo
Member

> I think we can cross that bridge when we get there, but wouldn't it make sense for p3m to mirror from the multiverse staging repository, at the same moment we create the snapshot ourselves?

That's a really good idea as Staging is already set up as a CRAN-like repo!

> We can host snapshots on a self-hosted HTTP server if you don't want to use a public service like Netlify or GitHub Pages. I don't think it makes sense to create a third universe only for hosting files, because this means we also restart all the checks and builds, and at that point it is no longer a snapshot but a new repository...

Agreed, the snapshot just needs static hosting, not a whole universe-type repo. I am not against using GitHub Pages: if that becomes unavailable, then I think we all have bigger problems to deal with.

At the end of the day, I think we'd want to point production.r-multiverse.org to the p3m location which would integrate our snapshot with the p3m dependencies, so users downloading from that repo can be guaranteed to get snapshot versions of all packages. Do we need / want to host our own snapshots (as a CRAN-like server) at all in that case?

@shikokuchuo
Member

shikokuchuo commented Sep 13, 2024

I've proceeded to test hosting a snapshot using GitHub Pages at https://github.com/r-multiverse/snapshot and it seems to work fine.

The workflow (with manual trigger) just modifies @jeroen's own https://github.com/jeroen/backup/blob/main/.github/workflows/mirror.yml to use the URL created by our staging workflow. I think GitHub recognises it as an internal address and the action finishes super fast.

Try:

available.packages(repos = "http://r-multiverse.org/snapshot/")

I can see this perhaps becoming an issue when we scale up and the size gets larger, but in that case I think we can limit hosting to just the source files and rely on p3m for instance to provide binaries.

I think we can use this to at least get started with our initial quarterly Production release.

@shikokuchuo shikokuchuo self-assigned this Sep 13, 2024
@wlandau
Member Author

wlandau commented Sep 23, 2024

This is all excellent!

When we scale up, will the source packages alone create a size issue in https://github.com/r-multiverse/snapshot? At some point after the official rollout of R-multiverse, we might ask the p3m folks if they could host snapshots without a mirror. They seemed open to it if/when R-multiverse is successful, just not right now at this early development stage.

Is the plan to remove binaries from https://github.com/r-multiverse/snapshot when the p3m integration is in place?

@shikokuchuo
Member

> When we scale up, will the source packages alone create a size issue in https://github.com/r-multiverse/snapshot?

GitHub supposedly blocks individual files above 100MB. Should not be a problem for source files.
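A quick way to confirm this before deploying would be to scan the repository directory for oversized files. This is a minimal sketch: `oversized_files` is a hypothetical helper, and the 100 MB default (computed here as 100 × 1024² bytes) is taken from the figure quoted above, not from a verified GitHub setting.

```r
# Return the files under `dir` that exceed `limit_mb` megabytes.
# Hypothetical helper; the default limit mirrors the 100MB figure above.
oversized_files <- function(dir, limit_mb = 100) {
  files <- list.files(dir, recursive = TRUE, full.names = TRUE)
  sizes_mb <- file.size(files) / 1024^2
  out <- data.frame(file = files, size_mb = sizes_mb, stringsAsFactors = FALSE)
  out[sizes_mb > limit_mb, , drop = FALSE]
}

# Demonstration on a temporary directory, using a tiny limit so that
# small placeholder files can trigger it.
demo_dir <- file.path(tempdir(), "size-check")
dir.create(demo_dir, showWarnings = FALSE)
writeBin(raw(2048), file.path(demo_dir, "big_1.0.0.tar.gz"))   # 2048 bytes
writeBin(raw(10), file.path(demo_dir, "small_1.0.0.tar.gz"))   # 10 bytes
too_big <- oversized_files(demo_dir, limit_mb = 0.001)
```

Run against a real checkout with the default limit, an empty result would mean no file is expected to be blocked.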

> Is the plan to remove binaries from https://github.com/r-multiverse/snapshot when the p3m integration is in place?

Yes. The 'snapshot' repo was really just for testing. We can have it deploy to production.r-multiverse.org when we cut our first release. Subsequently that may point to p3m.

@shikokuchuo
Member

FYI I've set up https://snapshot.r-multiverse.org/ to forward to https://r-multiverse.org/snapshot/, and I've tested that this subdomain now works with install.packages(). This is for the 'sample' repository.

We can do the same when we go live with Production to point https://production.r-multiverse.org/ to the right place.
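For anyone wanting to try the subdomain, a sketch of how it might be combined with a CRAN mirror follows. The helper `multiverse_repos`, the choice of `https://cloud.r-project.org` as the fallback mirror, and the package name `somepackage` are assumptions for illustration only.

```r
# Hypothetical helper: list the R-multiverse repo ahead of a CRAN mirror so
# that dependencies missing from the snapshot can still be resolved.
multiverse_repos <- function(url = "https://snapshot.r-multiverse.org",
                             cran = "https://cloud.r-project.org") {
  c(multiverse = url, CRAN = cran)
}

repos <- multiverse_repos()

# Not run here because it needs network access; "somepackage" is a placeholder.
# available.packages(repos = repos[["multiverse"]])
# install.packages("somepackage", repos = repos)
```

Once Production is live, the same call with `url = "https://production.r-multiverse.org"` should be the only change needed.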

@shikokuchuo
Member

The production GitHub repo has been prepared and is serving https://r-multiverse.org/production.
https://production.r-multiverse.org/ also points to https://r-multiverse.org/production.

So theoretically we just need to activate the workflow trigger and the Production snapshot will be deployed. It's good that we have 'snapshot' set up as well as we can use that as a test run.

@eitsupi
Member

eitsupi commented Oct 8, 2024

That reminds me, as a side note: I think it is inappropriate to license the snapshot repository under the MIT license, because it contains source code and binaries of various packages.
https://github.com/r-multiverse/snapshot

@shikokuchuo
Member

> That reminds me, as a side note: I think it is inappropriate to license the snapshot repository under the MIT license, because it contains source code and binaries of various packages. https://github.com/r-multiverse/snapshot

Yes that's a good point. I'll remove those now.

@shikokuchuo
Member

I'm going to close this as I think we are good for our first quarterly release. Ongoing discussions involving p3m etc. are going to be better served by new, more focused issues.

@wlandau
Member Author

wlandau commented Oct 11, 2024

> That reminds me, as a side note: I think it is inappropriate to license the snapshot repository under the MIT license, because it contains source code and binaries of various packages.

I agree.

We still need to license individual code files for the infrastructure though, and for the packages, it would help to explicitly put ownership and responsibility on the original authors. I posted a PR at r-multiverse/snapshot#2.

@wlandau
Member Author

wlandau commented Oct 11, 2024

Also, what is the difference between https://github.com/r-multiverse/snapshot and https://github.com/r-multiverse/production? Is the former for the p3m folks to use, while the latter is our current hosting mechanism before we integrate with p3m?

@shikokuchuo
Member

Snapshot is just a test, as I didn’t want to put anything at the production URL while it’s not officially released. But now that we have it, we can use it as a test run. We can remove mention of it on the website once we’ve released production.

@wlandau
Member Author

wlandau commented Oct 21, 2024

Community and staging will also need testing, cf. #74. At some point before we officially go live, I hope we can replicate the deployed R-multiverse infrastructure in a scaled-down dev space outside the main R-multiverse org.
