Update README.md #134

Open
wants to merge 2 commits into
base: main

anton-seaice
Contributor

Fixes links and contributes to COSIMA/access-om3#221

README.md (outdated)
Co-authored-by: Andrew Kiss <[email protected]>
@@ -94,6 +76,7 @@ to
```yaml
runlog: true
```
We recommend you create your own fork of this repository, and commit your branch to that fork. Otherwise, just committing your branch to a new github repository is a good way to track provenance and history of your work.
Contributor

It seems confusing to have this bit at the end of the instructions rather than at the start. Why not say to first create a fork, and then `payu clone` from that?

Contributor Author

This is how we've written it in other docs, and it is maybe our 'general' advice. @aidanheerdegen, I think you have thoughts on this?

Contributor Author

Also, is creating a fork messy due to branching? E.g. can you fork and include one branch only?

Member

aidanheerdegen commented on Oct 10, 2024

> Also, is creating a fork messy due to branching? E.g. can you fork and include one branch only?

Yeah, that was the main motivation. We're storing all the configs in a single repo for administrative convenience, but it isn't the best model for users. Additionally, we have release and dev branches, which is a complication most users will not need.

I mean, they can fork everything if they want, but typically I'd expect users to want a repo for related experiments, which would often be variations on a single configuration.

There might be some users who'd like a multi-configuration repository, say if they were looking at the effect of resolution on a process, but that's quite a bit rarer, and in any case they'd want to select which configs go in there, not just take everything.

It's a cost of this way of organising model configurations, and part of the motivation for adding to payu the ability to specify a branch to clone directly.
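
As a rough illustration of cloning a single configuration branch, something like the sketch below might work. It assumes that `payu clone` accepts `-B` for the source branch and `-b` for a new experiment branch; the repository URL, branch name and directory are hypothetical placeholders, not values from this PR. The command is only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: all three values are hypothetical placeholders.
REPO_URL="https://github.com/EXAMPLE-ORG/example-configs"
CONFIG_BRANCH="example-config-branch"
EXPERIMENT="my_experiment"

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Assumed flags: -B selects the branch to clone, -b names the new local branch.
run payu clone -B "$CONFIG_BRANCH" -b "$EXPERIMENT" "$REPO_URL" "$EXPERIMENT"
```

The resulting local repository would then contain only the chosen configuration branch plus the new experiment branch, rather than every release and dev branch.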

Contributor

Not sure I follow. I thought we wanted users to work from a fork so they can upload their experiment branches there (for sharing, safekeeping, provenance and journal data requirements)?

As far as I can see, the only options for forking are main branch only (which is no use at all), or everything (i.e. untick "Copy the main branch only"). It's apparently only possible to have one fork per user, so it makes sense to fork every branch so users have the option of running any config in future, using one repo for everything.

Member

> Not sure I follow. I thought we wanted users to work from a fork so they can upload their experiment branches there (for sharing, safekeeping, provenance and journal data requirements)?

Yes, we want users to upload their experiments to a GitHub repository.

A fork is just one (GitHub) way to create a repository. It is quite convenient, and it creates a strong link between the fork and upstream, which makes pull requests a bit easier.

That link is not particularly useful in this case. These configs are not really like code in this respect. Most users will take a config and run it, creating an experiment, or make some modification to the config and then run it, creating a perturbation experiment. An experiment repository is a history of a series of runs.

In most cases they won't be creating a PR back to this repo. And even if they did, it would have to be carefully crafted so as not to include a bunch of run history.

And as you've established above, it is impossible to fork just the branches you want. It's pretty much all or nothing.

So I'm advocating that users use payu to clone just the branch they want, run their experiment, and then use something like the `gh` command line tool to upload their experiment repository to GitHub.

I covered how this works in the Workshop training, and it is pretty straightforward:

https://forum.access-hive.org.au/t/running-model-experiments-with-payu-and-git/2285

I think the result is a simpler repo branch structure with only branches they have created.
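
A minimal sketch of the run-and-upload part of that workflow, under some assumptions: the experiment has already been cloned with payu into a directory called `my_experiment` (a hypothetical name), the clone's `origin` remote still points at the configuration repository, and a private repository is wanted. Renaming `origin` first is my own choice so that the name is free for the new repository; it is not something the linked post prescribes. Commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: hypothetical directory and repository name.
EXPERIMENT="my_experiment"

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

run cd "$EXPERIMENT"
run payu run

# Keep the configuration repository reachable as "upstream".
run git remote rename origin upstream

# Create a GitHub repository from the local one and push the current branch.
run gh repo create "$EXPERIMENT" --private --source=. --remote=origin --push
```

The new GitHub repository then holds only the branches present in the local experiment repository.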

A potentially useful analogy: this repo is like a library containing a number of books (configurations). A user can check out a copy of one of the books and add chapters to it. It wouldn't make sense to force users to take a copy of all the books if they just want the one.
