Update README.md #134
Co-authored-by: Andrew Kiss <[email protected]>
@@ -94,6 +76,7 @@ to
```yaml
runlog: true
```
We recommend you create your own fork of this repository, and commit your branch to that fork. Otherwise, just committing your branch to a new github repository is a good way to track provenance and history of your work.
It seems confusing to have this bit at the end of the instructions rather than at the start. Why not say to first create a fork, and then `payu clone` from that?
This is how we've written it in other docs, and it is maybe our 'general' advice. @aidanheerdegen, I think you have thoughts on this?
Also, is creating a fork messy due to branching? E.g. can you fork and include one branch only?
> Also, is creating a fork messy due to branching? E.g. can you fork and include one branch only?

Yeah, that was the main motivation. We're storing all the configs in a single repo for administrative convenience, but it isn't the best model for users. Additionally we have `release` and `dev` branches, which is a complication most users will not require.
I mean they can if they want, but typically I'd expect users would want to have a repo for related experiments, which would often be variations on a single configuration.
There might be some users who'd like to have a multi-configuration repository, say if they were looking at the effect of resolution on a process, but that's quite a bit rarer, and in any case they'd want to select what configs should be in there, and not just everything.
It's a cost of this model of organising model configurations, and part of the motivation for adding the capability to `payu` to specify a branch to clone directly.
Not sure I follow. I thought we wanted users to work from a fork so they can upload their experiment branches there (for sharing, safekeeping, provenance and journal data requirements)?
As far as I can see, the only options for forking are main branch only (which is no use at all), or everything (i.e. untick "Copy the main branch only"). It's apparently only possible to have one fork per user, so it makes sense to fork every branch so users have the option of running any config in future, using one repo for everything.
> Not sure I follow. I thought we wanted users to work from a fork so they can upload their experiment branches there (for sharing, safekeeping, provenance and journal data requirements)?

Yes, we want users to upload their experiments to a GitHub repository.
A fork is just one (GitHub) way to create a repository. It has the advantage of being quite convenient, and creates a strong link between the fork and upstream, which makes pull requests a bit more convenient.
This is not a particularly useful feature in this case. These configs are not really like code in this respect. Most users will take a config and run it, creating an experiment, or make some modification to the config and then run it, creating a perturbation experiment. An experiment repository is a history of a series of runs.
In most cases they won't be creating a PR back to this repo. And even if they did it would have to be carefully crafted so as not to include a bunch of run history.
And as you've established above, it is impossible to fork just the branches you want. It's pretty much all or nothing.
So I'm advocating using `payu` to clone just the branch they want, run their experiment, and use something like the `gh` command line tool to upload their experiment repository to GitHub.
I covered how this works in the Workshop training, and it is pretty straightforward:
https://forum.access-hive.org.au/t/running-model-experiments-with-payu-and-git/2285
I think the result is a simpler repo branch structure with only branches they have created.
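The clone, run and upload workflow described above can be sketched roughly as follows. This is only an illustration, not the procedure from the workshop material: the repository URL, branch name and experiment name are hypothetical placeholders, the helper functions `experiment_commands` and `run_workflow` are invented for this example, and the remote name `experiment` is an assumption chosen so that it does not clash with the `origin` remote left by the clone. The script only prints the commands unless `dry_run` is turned off.

```python
import shlex
import subprocess


def experiment_commands(config_repo, config_branch, experiment, github_repo):
    """Return the clone, run and upload steps as argv lists.

    All argument values are placeholders supplied by the caller.
    """
    return [
        # Clone only the chosen configuration branch (-B) and start a new
        # experiment branch (-b) in a directory named after the experiment.
        ["payu", "clone", "-B", config_branch, "-b", experiment,
         config_repo, experiment],
        # Run the experiment; this step is executed inside the clone.
        ["payu", "run"],
        # Create a new GitHub repository from the local clone and push to it.
        # The remote name "experiment" is an assumption for this sketch.
        ["gh", "repo", "create", github_repo, "--private",
         "--source=.", "--remote=experiment", "--push"],
    ]


def run_workflow(commands, experiment_dir, dry_run=True):
    """Print each command, and execute it unless dry_run is set."""
    for index, argv in enumerate(commands):
        # The clone runs in the current directory; later steps run in the clone.
        cwd = None if index == 0 else experiment_dir
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    steps = experiment_commands(
        "https://github.com/example-org/example-configs",  # hypothetical URL
        "release-example-config",                          # hypothetical branch
        "my_experiment",
        "my_experiment",
    )
    run_workflow(steps, "my_experiment", dry_run=True)
```

With this approach the uploaded repository contains only the experiment branch the user created, rather than every configuration branch a full fork would carry.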
A potentially useful analogy: this repo is like a library, containing a number of books (configurations). It is possible to check out a copy of one of the books, and add chapters to it. It wouldn't make sense to force users to take a copy of all the books if they just want the one.
Fixes links and contributes to COSIMA/access-om3#221