Re-work how the notebooks repo is pulled #79
Comments
It should be possible to configure the default path that is displayed in the file browser the first time a user logs in, by generating a default config for the JupyterLab workspace: But I would not rewrite it afterwards.
I would like to see the repo in a subdirectory rather than the root of the user's home directory. Is there any reason not to?

Motivation

We often have new starters and visiting collaborators, and these new sandbox users have a huge learning curve: simultaneously learning git, unix and python. I think it is bad that the first thing we instruct them is to create a repo inside another repo, since this adds further inception-esque confusion, and is a widely discouraged git practice (i.e. git already interprets repo subdirs in a special way; it is asking for trouble, and complicates learning how git will behave). Already their beginner-mistakes routinely lead to trying fairly serious git kung fu (e.g. resetting to past states, rewriting objects out of histories, etc) to un-break everything and recover their work. I think we should instead encourage a best-practice workflow, that is as close as possible to what we want them to follow on other linux platforms like NCI (avoiding features specific to DEA-sandbox such as

Preference

I think we should have the repo pre-populated in Instruct all new users to create a different subdirectory, and there establish their own branch repo (but don't do it for them). I don't even think it is necessary to have jupyter pre-navigate to the examples directory; I see no harm in expecting first-time users to intuitively click "examples". (In fact I think this is better than needing to explain
I'm in favour of having the notebooks loaded into an I think that having a "first start" README in the root/home folder is a good idea. And we already have some logic for a I think that we must force overwrite the folder and shouldn't copy the I wasn't involved in the decision to pull out the examples into folders in the root of the project, but I'm aware there was a decision there.
I'd also be happy with a This readme would need to walk the user through in baby steps, even to the level of:
(The readme could also be the place to include the warning that files in As long as this was clearly explained and shown to the user at start up, I think it could serve as a nice way to familiarise the user with using the file browsing interface and launching notebooks for the first time. I'd be happy to work on the readme if we settled on this as an approach.
I agree with a lot of the points made here, and particularly agree with Robbi's point that the user needs to have some clear guidance around how to use the sandbox and the As an additional thought, I've been using Amazon SageMaker recently, and they use a Jupyter Lab extension to manage their example notebooks. I think there are some upsides and downsides to this approach, but would be happy to discuss further with anyone that's interested. You can see a bit of a preview of how it works here: https://docs.aws.amazon.com/sagemaker/latest/dg/howitworks-nbexamples.html. One of the main benefits is that the examples are read-only, with a pop-up asking the user if they want to copy the file to their directory.
I think a README is an excellent idea. I don't have a strong feeling as to whether we move the sandbox examples to the examples folder or leave them as the top dir, but either way I think we need a README so that users have a good idea of what's what. We might be able to create read-only notebooks without using SageMaker (https://coding-stream-of-consciousness.com/2018/11/12/read-only-protected-jupyter-notebooks/), though it looks like it'll take a bit more effort. SageMaker looks interesting. Do we know how the pricing compares to notebooks as they are on the sandbox?
Hey @BexDunn, sorry, my SageMaker link might have been a bit misleading. There's no need to use SageMaker specifically, it's just an example implementation of a Jupyter Lab extension that can handle a collection of example notebooks. The actual extension is here: https://github.com/danielballan/nbexamples
Would another option be simply to symlink That would also be a place to administer
I don't think we want to stop people from being able to write to it, @benjimin. It makes the notebooks do weird things if they're read only, I think. I like @caitlinadams' suggestion of using the
Also, a potential security motivation is that any files which should not be committed to any git history still get stored somewhere inside the working directory of a repo (i.e. training users to invite mistakes in sensitive data management). |
Current Process:
On each server startup we clone the git repo to a temporary folder, remove some files (including .git), and copy the contents to the home directory of the notebook user with rsync.
This approach ensures that no git merge conflicts cause issues with the user's files.
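The startup steps described above could be sketched roughly as follows. This is only an illustration of the clone, strip, and rsync sequence, not the sandbox's actual startup script: the function names, the repository URL passed in, and the list of removed files (other than `.git`, which the description names) are assumptions.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Assumption: only ".git" is named in the description; other removed
# files are not listed there, so none are guessed here.
FILES_TO_REMOVE = [".git"]


def clone_command(repo_url, dest):
    """Build the git command that clones the repo into a temporary folder."""
    return ["git", "clone", repo_url, str(dest)]


def rsync_command(src, home):
    """Build the rsync command that copies the checkout into the home directory.

    The trailing slash on the source copies the folder's contents rather
    than the folder itself.
    """
    return ["rsync", "-a", f"{src}/", f"{home}/"]


def strip_files(checkout, names):
    """Delete the listed files or directories from the checkout, if present."""
    for name in names:
        target = Path(checkout) / name
        if target.is_dir():
            shutil.rmtree(target)
        elif target.exists():
            target.unlink()


def refresh_home(repo_url, home):
    """Clone, strip, and copy: the three steps run on each server startup."""
    with tempfile.TemporaryDirectory() as tmp:
        checkout = Path(tmp) / "checkout"
        subprocess.run(clone_command(repo_url, checkout), check=True)
        strip_files(checkout, FILES_TO_REMOVE)
        subprocess.run(rsync_command(checkout, home), check=True)
```

Because `.git` is removed before the copy, the home directory never becomes a working tree of the notebooks repo, which is why no merge conflicts can arise. Note that `rsync -a` without `--delete` replaces same-named files in the destination and leaves everything else untouched.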
However, it has some downsides:
Alternatives