Include Git information during publishing? #100

Open
craig-willis opened this issue Dec 17, 2019 · 2 comments

Comments

@craig-willis
Collaborator

craig-willis commented Dec 17, 2019

If the workspace is a Git repo hosted on a service such as GitHub, could we include information about the repo during the publish process, so that the research repository record (Zenodo, Dataverse) is linked to the Git repo? Combined with #99, this starts to be compelling even for a Git user.

Basic workflow:

  • Create a tale based on your Git repo (init for now)
  • Build the tale image
  • Run your code
  • Publish to Zenodo + Docker with Git info

In the end, you get a Zenodo package that's connected to both your Git repo and a Docker Hub image, with all three linked automatically.
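A minimal sketch of what "Git info" could mean at publish time, assuming the workspace is a clone with a remote named `origin`. The helper names (`collect_git_info`, `normalize_remote`) are hypothetical and not part of any existing Whole Tale API; only standard `git` subcommands are used.

```python
import re
import subprocess
from pathlib import Path


def _git(workspace: Path, *args: str) -> str:
    """Run a git command inside the workspace and return its trimmed output."""
    result = subprocess.run(
        ["git", "-C", str(workspace), *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def normalize_remote(url: str) -> str:
    """Turn an SSH-style remote (git@host:owner/repo.git) into an https URL."""
    match = re.match(r"^git@([^:]+):(.+?)(?:\.git)?$", url)
    if match:
        return f"https://{match.group(1)}/{match.group(2)}"
    # Already an http(s) URL: only drop a trailing ".git" suffix.
    return re.sub(r"\.git$", "", url)


def collect_git_info(workspace: Path) -> dict:
    """Gather remote URL, commit hash, and branch (hypothetical helper).

    Assumes the remote is called "origin"; raises CalledProcessError if the
    directory is not a git repository or has no such remote.
    """
    return {
        "remote": normalize_remote(_git(workspace, "remote", "get-url", "origin")),
        "commit": _git(workspace, "rev-parse", "HEAD"),
        "branch": _git(workspace, "rev-parse", "--abbrev-ref", "HEAD"),
    }
```

For example, `normalize_remote("git@github.com:owner/repo.git")` gives `https://github.com/owner/repo` (a placeholder repository), which is the kind of link a Zenodo record could carry alongside the commit hash.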

@ThomasThelen
Member

ThomasThelen commented Sep 23, 2020

When I imagine what a Tale looks like when it's connected with a git repo I see something like...

Tale_ID/
|-data/
|-workspace/
   |-cloned_git_repository
       |-.git

When it comes time to publish, the workspace can remain a git repository as long as we include the .git directory. We could also enhance the Tale manifest to include information about the repository; this should be possible using schema.org terms.
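One possible shape for such a manifest entry, sketched as JSON-LD built in Python. `SoftwareSourceCode`, `codeRepository`, and `version` are existing schema.org terms; where the entry would sit in the Tale manifest, the function name `repository_entry`, and the example URL are assumptions made for illustration.

```python
import json


def repository_entry(remote_url: str, commit: str) -> dict:
    """Describe a git repository with schema.org terms (illustrative only)."""
    return {
        "@context": {"schema": "http://schema.org/"},
        "@type": "schema:SoftwareSourceCode",
        # Link back to the hosted repository.
        "schema:codeRepository": remote_url,
        # Record the commit that was checked out at publish time.
        "schema:version": commit,
    }


if __name__ == "__main__":
    # Placeholder values, not taken from a real Tale.
    entry = repository_entry("https://github.com/owner/repo", "0123abc")
    print(json.dumps(entry, indent=2))
```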

Exporting

When I think of what the Tale above looks like when exported, I imagine something like the following (bag files left out).

Tale_ID/
|-data/
    |-workspace/
       |-cloned_git_repository/
           |-.git

In the export, the repository sits one folder deeper than it does in the Tale, because workspace/ is nested under data/. As long as run.sh takes that into account, someone should be able to download a Tale and make git changes.
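Because the depth differs between the Tale and the export, a tool could locate repositories by searching for `.git` directories instead of hard-coding the path. This is a sketch under that assumption; `find_git_repos` is a hypothetical helper, not something run.sh currently does.

```python
from pathlib import Path


def find_git_repos(tale_root: Path) -> list[Path]:
    """Return repository directories under tale_root, relative to tale_root.

    Only directories named ".git" are considered, so worktrees or submodules
    that use a ".git" file are skipped.
    """
    return sorted(
        git_dir.parent.relative_to(tale_root)
        for git_dir in tale_root.rglob(".git")
        if git_dir.is_dir()
    )
```

For the exported layout above, this would report `data/workspace/cloned_git_repository`; for the Tale itself, `workspace/cloned_git_repository`.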

Publishing

Publishing works similarly to exporting, and should already be supported (we still need to check whether dot files such as .git are included when publishing).

Zenodo

The exported bags are what gets published to Zenodo, and (as described above) they are capable of supporting reuse of the git repository.

DataONE

When publishing to DataONE, the individual files are pulled out of the bag. The locations are documented with prov:atLocation, which is how the filesystem is reconstructed at download. It might be worthwhile putting some of the RDF about the git repository in the resource map too.
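To illustrate why the recorded locations matter for a git repository, here is a rough sketch of rebuilding a directory tree from location records. It assumes each object identifier maps to a relative POSIX-style path taken from prov:atLocation; the identifier format, the in-memory `contents` mapping, and the function name `rebuild_layout` are all assumptions, and a real client would fetch bytes from the repository instead.

```python
from pathlib import Path, PurePosixPath


def rebuild_layout(dest: Path, locations: dict[str, str],
                   contents: dict[str, bytes]) -> list[Path]:
    """Write each object to the path recorded for it, creating folders as needed."""
    written = []
    for identifier, location in locations.items():
        rel = PurePosixPath(location.lstrip("/"))
        # Refuse locations that would escape the destination directory.
        if ".." in rel.parts:
            raise ValueError(f"unsafe location for {identifier}: {location}")
        target = dest.joinpath(*rel.parts)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(contents[identifier])
        written.append(target)
    return written
```

Files inside `.git` (for example `.git/HEAD`) would need their locations recorded the same way for the downloaded copy to still work as a repository.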

@ThomasThelen
Member

At the 2020-10-12 development meeting, we decided to publish/export the entire .git folder. The published state is presumably whatever HEAD points to when the user exports or publishes. We also decided not to include this information in the RDF, although it may be possible to find an ontology for describing git repositories.
