Integrate JMTE hub into our current infrastructure #2201

Closed
18 of 22 tasks
yuvipanda opened this issue Feb 14, 2023 · 16 comments

@yuvipanda
Member

yuvipanda commented Feb 14, 2023

The Jupyter Meets the Earth grant is what is currently funding my time, and I'm going to work with @fperez to bring #436 into the 2i2c infrastructure. The current hub and that PR were the amazing work of @consideRatio, blazing several trails that have led our current infrastructure to where it is today. This issue tracks the various bits needed to integrate that hub into our infrastructure as a regular hub so it can be supported via our regular means.

TODO

@yuvipanda yuvipanda assigned yuvipanda and unassigned yuvipanda Feb 14, 2023
@damianavila
Contributor

@yuvipanda, do we have a deadline for this one to happen? Just trying to figure out a timeline...

@consideRatio
Member

consideRatio commented Feb 15, 2023

Add SFTP service to basehub, as that service is used to bring data in and out of home directories. JupyterHub SSH itself doesn't seem to be used much, so we can probably ignore that one.

If there is agreement that we should support this, then this is what I suggest as a solution to this ticket for the LIS basehub in the shared 2i2c cluster.

Does that seem reasonable? If so, I suggest we extract it into a dedicated issue, as that work item is large enough on its own and should be easy to keep self-contained, I think.

@consideRatio
Member

I've opened 2i2c-org/features#20 with this

@yuvipanda
Member Author

yuvipanda commented Mar 28, 2023

I think for the first quarter, what we'd like to accomplish is:

  • Set up a new hub based on our usual AWS setup, nothing specific. Let's ignore terraform import and just do a full new setup.
  • Copy over user files and any storage buckets that are needed.
  • Move the image building to new repo with repo2docker-action

Things that will be stretch goals:

  • SFTP setup

What we are not going to do is try to import current resources. Importing the EFS may still be worth it; otherwise let's not try too much.

@yuvipanda
Member Author

I could import the existing EFS with terraform import -var-file=projects/jmte.tfvars aws_efs_file_system.homedirs fs-01707b06
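
For reference, here is a minimal sketch of how the import command above is put together. The helper name `build_terraform_import` is hypothetical and only used for illustration; the var file, resource address, and filesystem ID are the ones quoted in the comment above. It only builds and prints the command, it does not run Terraform.

```python
import shlex


def build_terraform_import(var_file: str, address: str, resource_id: str) -> list[str]:
    """Build the argv for `terraform import` (hypothetical helper, illustration only).

    `terraform import` takes the resource address in the Terraform config,
    followed by the ID of the existing cloud resource to adopt into state.
    """
    return ["terraform", "import", f"-var-file={var_file}", address, resource_id]


# Values taken from the comment above.
cmd = build_terraform_import(
    var_file="projects/jmte.tfvars",
    address="aws_efs_file_system.homedirs",
    resource_id="fs-01707b06",
)

# Print the command as a shell-safe string instead of executing it.
print(shlex.join(cmd))
```

After an import like this, running `terraform plan` with the same var file is the usual way to check that the imported state matches the configuration before applying anything.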

yuvipanda added a commit to yuvipanda/pilot-hubs that referenced this issue Apr 10, 2023
- Use 'jupyter-meets-the-earth' rather than jmte as name,
  because the existing cluster is already called 'jmte'.
- SFTP service is gone!
- Replicates config from
https://github.com/2i2c-org/infrastructure/pull/436/files
  to the extent possible
- Uses our IRSA config for AWS permissions, rather than the
  eksctl created service account in use earlier.
- Uses CILogon+GitHub for authentication, rather than auth0+github
- Re-use the same EFS filesystem from before, avoiding the need to
  copy a few terabytes of data around
- Hub is now at jmte.2i2c.cloud, and the old URL
  (hub.jupyterearth.org) redirects here. Same for staging.

Ref 2i2c-org#2201
@yuvipanda
Member Author

#2474 moves most things over, and users are already back on this!

@yuvipanda
Member Author

Checked with @consideRatio and cleaned up the remnants of an old Magic Castle installation.

@yuvipanda
Member Author

I've created new accounts for everyone else. I think @consideRatio already has access, and so does @damianavila. Let me know if that is not the case.

@damianavila
Contributor

and so does @damianavila. Let me know if that is not the case.

I do not think I have access (I can't find any internal references). Can you resend the credentials for my existing user? Thanks!

@yuvipanda
Member Author

@damianavila reset and sent.

@yuvipanda
Member Author

I think for this quarter, what's left to do is to move the image building process to using repo2docker-action. Then this can be considered done for this quarter

@damianavila
Contributor

I think for this quarter, what's left to do is to move the image building process to using repo2docker-action.

Created #2512 for that one!

@yuvipanda
Member Author

I have now asked the JMTE folks to direct support through our 2i2c support process, rather than by pinging me or Erik individually on their Slack!

@consideRatio
Member

@yuvipanda do you have thoughts on what level of support 2i2c is to provide JMTE as part of this?

Specifically I'm considering if we are to help with user image updates (here). I think that may be an expectation currently from jmte when transitioning to 2i2c support.

@yuvipanda
Member Author

@consideRatio that is the current expectation, yes. I do want us to move that image to using repo2docker-action and standardize it along with the rest of the images we maintain. We should also perhaps set the expectation that image updates might be slower until that is done?

yuvipanda added a commit to yuvipanda/pilot-hubs that referenced this issue Jun 20, 2023
I think overall, we want to reduce the number of images we maintain
for our end users. A big part of this is to use upstream images directly
wherever possible, and allow users to choose. This helps us benefit
from upstream fixes as quickly as possible, and reduces the total
amount of work done. For example, instead of specifically bumping
the version of Julia just for this one
image (pangeo-data/jupyter-earth#166),
we could instead do that upstream and benefit
everyone (jupyter/docker-stacks#1917).

Faster startup times are another benefit, as the more specific
images are smaller than a big 'all-in-one' image.

There are some features of the all-in-one image that currently don't
easily exist upstream:

- Linux desktop
- Nix
- Specific extra packages that may be installed

We can figure these out over time, but not maintaining the all-in-one
image is a nice goal to shoot for. To this end, the JMTE image is
still the default, but marked as 'deprecated' as I don't want to
continue doing a lot of maintenance on it.

Ref 2i2c-org#2201
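
As a rough illustration of "allow users to choose" with the JMTE image kept as a deprecated default, here is a minimal sketch of a KubeSpawner-style profile list. This is an assumption about what such a config could look like, not the contents of the commit above. The helper `build_profile_list` and all image names are hypothetical placeholders; the keys `display_name`, `description`, `default`, and `kubespawner_override` are the ones KubeSpawner's `profile_list` accepts.

```python
def build_profile_list(deprecated_image: str, upstream_images: dict[str, str]) -> list[dict]:
    """Build a profile list with one deprecated default and several upstream choices.

    Hypothetical helper for illustration only.
    """
    profiles = [
        {
            "display_name": "JMTE all-in-one image (deprecated)",
            "description": "Still the default, but deprecated.",
            "default": True,
            "kubespawner_override": {"image": deprecated_image},
        }
    ]
    # Each upstream image becomes its own, smaller and more specific, choice.
    for display_name, image in upstream_images.items():
        profiles.append(
            {
                "display_name": display_name,
                "kubespawner_override": {"image": image},
            }
        )
    return profiles


# All image references below are placeholders, not real image names.
profile_list = build_profile_list(
    deprecated_image="registry.example/jmte-all-in-one:latest",
    upstream_images={
        "Upstream Python image": "registry.example/upstream-python:latest",
        "Upstream Julia image": "registry.example/upstream-julia:latest",
    },
)

# In a jupyterhub_config.py this could be assigned as:
# c.KubeSpawner.profile_list = profile_list
for profile in profile_list:
    print(profile["display_name"], "->", profile["kubespawner_override"]["image"])
```

Keeping exactly one entry marked `default` means existing users land on the same image as before unless they pick something else.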
@yuvipanda yuvipanda removed their assignment Jul 27, 2023
@consideRatio
Member

I think this is sufficiently complete and no longer a useful issue to keep open, so I'm closing it.

@consideRatio consideRatio removed their assignment Jun 26, 2024