Commit
Merge pull request #23 from t4d-gmbh/17-2nd-proofreading-round-part-4
17 2nd proofreading round part 4
e-BaMaMe authored Nov 1, 2024
2 parents 09e28da + 65d72c0 commit b1bbe01
Showing 12 changed files with 117 additions and 114 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -1,8 +1,8 @@
# Git in Science

This is T4D's practical guide for professionals and hobbyists that want to
learn [more] about how [git](https://git-scm.com/) and related remote services
allow to enhance the quality of scientific work.
This is T4D's practical guide for professionals and hobbyists who want to
learn [more] about how [Git](https://git-scm.com/) and related remote services
can enhance the quality of scientific work.

<!-- include-before -->

14 changes: 7 additions & 7 deletions source/content/intro/historical_perspective.md
@@ -1,18 +1,18 @@
## A Historical <i class="fa-brands fa-pied-piper-alt"></i> Perspective

{% if page %}
<i class="fab fa-git"></i>, a distributed version control system, was created by Linus Torvalds in 2005 to manage the development of the Linux <i class="fa-brands fa-linux"></i> kernel.
Before <i class="fab fa-git"></i>, the Linux <i class="fa-brands fa-linux"></i> kernel project used a proprietary system called BitKeeper, but due to licensing issues, the need for an open-source alternative became apparent.
<i class="fab fa-git"></i>, a distributed version control system, was created by Linus Torvalds in 2005 to manage the development of the Linux kernel.
Before <i class="fab fa-git"></i>, the Linux <i class="fa-brands fa-linux"></i> kernel project used a proprietary system called BitKeeper, but licensing issues highlighted the need for an open-source alternative.
Torvalds designed <i class="fab fa-git"></i> to be fast, efficient, and capable of handling large projects with a distributed workflow.

Initially, <i class="fab fa-git"></i> was primarily used by software developers, but its robust features soon caught the attention of other fields, including scientific research.
The scientific community recognized the potential of <i class="fab fa-git"></i> for managing complex projects, tracking changes, and facilitating collaboration.
Here’s how <i class="fab fa-git"></i> evolved to become a staple in science:
The scientific community recognized <i class="fab fa-git"></i>'s potential for managing complex projects, tracking changes, and facilitating collaboration.
Here’s how <i class="fab fa-git"></i> evolved to become a powerful tool in science:

- **Adoption by Open-Source Projects:** <i class="fab fa-git"></i>’s success in open-source software projects demonstrated its capabilities in managing collaborative work, which is a common requirement in scientific research.
- **Adoption by Open-Source Projects:** <i class="fab fa-git"></i>’s success in open-source software projects demonstrated its capabilities in managing collaborative work, which is also highly relevant in scientific research.
- **Integration with Platforms:** The rise of platforms like <i class="fab fa-git"></i>Hub and <i class="fab fa-git"></i>Lab provided user-friendly interfaces and additional features such as issue tracking, project management, and collaborative tools. These platforms made <i class="fab fa-git"></i> more accessible to non-developers, including scientists.
- **Reproducibility and Transparency:** As the importance of reproducibility in scientific research grew, <i class="fab fa-git"></i>’s ability to maintain a detailed history of changes became invaluable.
- **Interdisciplinary Collaboration:** Modern scientific research often involves interdisciplinary teams. <i class="fab fa-git"></i>’s collaborative features facilitated seamless cooperation between computer scientists, biologists, physicists, and other researchers, breaking down barriers between disciplines.
- **Reproducibility and Transparency:** With the growing importance of reproducibility in scientific research, <i class="fab fa-git"></i>’s ability to maintain a detailed history of changes proved invaluable for documenting and replicating research.
- **Interdisciplinary Collaboration:** Modern scientific research often involves interdisciplinary teams. <i class="fab fa-git"></i>’s collaborative features enabled seamless cooperation between computer scientists, biologists, physicists, and other researchers, fostering collaboration across different fields.

Today, <i class="fab fa-git"></i> is widely used in scientific research for version control, collaboration, and ensuring the reproducibility of computational experiments.
Its evolution from a tool for software development to a cornerstone of scientific research highlights its versatility and the growing intersection between technology and science.
31 changes: 15 additions & 16 deletions source/content/lfs/comparison-git-gitlfs.md
@@ -4,33 +4,32 @@

### <i class="fab fa-git"></i>

When you commit changes with <i class="fab fa-git"></i>, it creates objects to represent the state of your files at that point in time.
These objects are stored in the `.git/objects` directory.
When you commit changes with <i class="fab fa-git"></i>, it creates objects to represent the state of your files at that point in time, which are stored in the `.git/objects` directory.
Each object is a snapshot of the file contents, and <i class="fab fa-git"></i> uses pointers (hashes) to reference these objects.
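
To make the idea of hash "pointers" concrete, here is a minimal Python sketch of how a blob object ID can be derived from file contents. It assumes a repository using the default SHA-1 object format; the function names `git_blob_id` and `loose_object_path` are hypothetical helpers for illustration, not part of any Git API.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object ID of a blob with the given content.

    Assumption: the repository uses the SHA-1 object format.
    The hash covers a small header ("blob <size>" plus a NUL byte)
    followed by the raw file content.
    """
    header = f"blob {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Return where a loose (unpacked) object with this ID would be stored."""
    return f".git/objects/{object_id[:2]}/{object_id[2:]}"


if __name__ == "__main__":
    blob_id = git_blob_id(b"hello world\n")
    print(blob_id)
    print(loose_object_path(blob_id))
```

For the same content, `git hash-object <file>` should report the same ID in a SHA-1 repository. Identical file contents therefore always map to the same object.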

<i class="fab fa-git"></i> is optimized for text files.
Text files often have small, incremental changes, that <i class="fab fa-git"></i> can handle well by storing only the differences (deltas) between versions.
<i class="fab fa-git"></i> is optimized for handling text files.
Since text files typically have small, incremental changes, <i class="fab fa-git"></i> efficiently stores only the differences (deltas) between versions.

Changes in binary files (e.g., images, videos, datasets) are different from text files because changes are not easily represented as deltas.
When a binary file changes, <i class="fab fa-git"></i> often stores the entire file again, leading to repository bloat and slow performance.
However, changes in binary files (e.g., images, videos, datasets) are not as easily represented as deltas.
When a binary file is modified, <i class="fab fa-git"></i> often stores the entire file again, leading to bloated repositories and slower performance over time.

### <i class="fab fa-git"></i> LFS

<i class="fab fa-git"></i> LFS replaces large files with small pointer files.
These pointer files reference the actual content stored outside the main repository.
<i class="fab fa-git"></i> LFS replaces large files with small pointer files that
reference the actual content stored outside the main repository.

The actual large files are stored in a seperate location (e.g., a remote server) which keeps the main repository lightweight and efficient.
When you clone a repository with <i class="fab fa-git"></i> LFS, you only download the pointer files, not the actual large files.
When you check out a file, <i class="fab fa-git"></i> LFS automatically downloads the large file content.
Similarly, when you commit a large file, <i class="fab fa-git"></i> uploads the large file to the external storage and replaces it with a pointer file in the repository.
The large files themselves are stored in a separate location (e.g., a remote server), which keeps the main repository lightweight and efficient.
When you clone a repository using <i class="fab fa-git"></i> LFS, only the pointer files are downloaded, not the large files.
When you check out a file, <i class="fab fa-git"></i> LFS automatically downloads the actual file content.
Similarly, when you commit a large file, <i class="fab fa-git"></i> LFS replaces it with a pointer file in the repository and uploads the content to the external storage when you push.
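
As a rough illustration of what such a pointer file contains, the following Python sketch builds the pointer text for a given file content, following the version 1 pointer format (a version line, a SHA-256 object ID, and the size in bytes). The function name `lfs_pointer` is a hypothetical helper; in practice the pointer is generated by the Git LFS client, not by hand.

```python
import hashlib


def lfs_pointer(content: bytes) -> str:
    """Build the text of a Git LFS (spec v1) pointer for the given content.

    The pointer is a few short lines regardless of how large the
    content is; the content itself is stored elsewhere under its
    SHA-256 object ID.
    """
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


if __name__ == "__main__":
    # Stand-in for a large binary file (about 5 MB of zero bytes).
    large_file = bytes(5_000_000)
    pointer = lfs_pointer(large_file)
    print(pointer)
    print(f"pointer: {len(pointer)} bytes, content: {len(large_file)} bytes")
```

The repository only has to version the small pointer text, which is why history stays lightweight even when the referenced files are large.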

{% else %}

| <i class="fab fa-git"></i> | <i class="fab fa-git"></i> LFS |
|:---:|:-------:|
| Optimized for text files | Optimized for large files |
| Stores entire file contents | Stores large files externally |
| Slows down with large files | Maintains repository performance |
| Bloated repositories | Lightweight repositories |
| Stores entire file contents locally | Stores large files externally |
| Performance slows with large files | Maintains repository performance with large files |
| Leads to bloated repositories | Keeps repositories lightweight |

{% endif %}
{% endif %}
25 changes: 11 additions & 14 deletions source/content/lfs/gitLFS_UZH.md
@@ -3,28 +3,25 @@
{% if build == "pages" %}
Popular <i class="fab fa-git"></i> hosting services like GitHub, GitLab, Bitbucket, and Azure DevOps have built-in support for <i class="fab fa-git"></i> LFS.

In self-hosted <i class="fab fa-git"></i> server, it must be ensured that they have <i class="fab fa-git"></i> LFS support enabled which might come with installing and configuring additional software.
For self-hosted <i class="fab fa-git"></i> servers, it is important to ensure <i class="fab fa-git"></i> LFS support is enabled. This may require additional installation and configuration because <i class="fab fa-git"></i> LFS stores large files in a separate storage location, which requires extra server-side support (e.g., for storage, authentication, and bandwidth handling).

This is because <i class="fab fa-git"></i> LFS stores large files in a separate storage location, which requires additional server-side support to manage the large files (e.g., storage, authentication, bandwidth, etc.).
At the IMATH, <i class="fab fa-git"></i> LFS is **NOT supported** on the [IMATH GitLab](https://gitlab.imath.uzh.ch) instance.

At the IMATH, <i class="fab fa-git"></i> LFS is NOT supported on the [IMATH GitLab](https://gitlab.imath.uzh.ch) instance.
However, at the University of Zurich (UZH), <i class="fab fa-git"></i> LFS is supported on the [UZH GitLab](https://gitlab.uzh.ch) instance, with a **size limit of 15 GB per project** (this includes all parts of a project, i.e. <i class="fab fa-git"></i> repository, LFS, etc.).
The data is stored in the [Switch Cloud](https://www.switch.ch/en/competencies/cloud), which is hosted outside of UZH but remains within Switzerland; the UZH data protection regulations generally still apply.

At the University of Zurich (UZH), <i class="fab fa-git"></i> LFS is supported on the [UZH GitLab](https://gitlab.uzh.ch) instance.
There is a **size limit of 15 GB per project** (this includes all parts of a project, i.e. <i class="fab fa-git"></i> repository, LFS, etc.).
The data is stored in the [Switch Cloud](https://www.switch.ch/en/competencies/cloud), which means outside the UZH but within Switzerland (generally the UZH data protection regulations apply).

Please note that GitLab is generally a place for collaborative software development rather than simply data storage.
For example, UZH has limited disk space and deletes all data after 12 months of user inactivity.
Please note that GitLab is primarily intended for collaborative software development, not simply for data storage.
UZH's GitLab has limited disk space, and all data is deleted after 12 months of user inactivity.
For long-term data storage, it is recommended to use [OneDrive](https://uzh-my.sharepoint.com/my) or [SwitchDrive](https://drive.switch.ch/).

{% else %}

- <i class="fab fa-git"></i> hosting services like GitHub, GitLab, and Bitbucket support <i class="fab fa-git"></i> LFS.
- Self-hosted <i class="fab fa-git"></i> servers need additional configuration for <i class="fab fa-git"></i> LFS.
- <i class="fab fa-git"></i> LFS is _not_ supported on the [IMATH GitLab instance](https://git.math.uzh.ch/).
- Popular <i class="fab fa-git"></i> hosting services like GitHub, GitLab, and Bitbucket support <i class="fab fa-git"></i> LFS.
- Self-hosted <i class="fab fa-git"></i> servers require additional configuration to enable <i class="fab fa-git"></i> LFS.
- <i class="fab fa-git"></i> LFS is **not supported** on the [IMATH GitLab instance](https://git.math.uzh.ch/).
- <i class="fab fa-git"></i> LFS is supported on the [UZH GitLab instance](https://gitlab.uzh.ch/).
- **Size Limit**: 15 GB per project.
- **Storage**: [Switch Cloud](https://www.switch.ch/en/competencies/cloud) (Switzerland).
- **Data Storage**: [Switch Cloud](https://www.switch.ch/en/competencies/cloud) (Switzerland).
- **Data Retention**: Data is deleted after 12 months of inactivity.

{% endif %}
{% endif %}
2 changes: 1 addition & 1 deletion source/content/lfs/index.md
@@ -1,4 +1,4 @@
# <i class="fab fa-git"></i> Large File System (LFS)
# <i class="fab fa-git"></i> Large File Storage (LFS)
{% if slide %}
<!-- BUILDING THE SLIDES -->
```{toctree}
18 changes: 9 additions & 9 deletions source/content/lfs/why-use-git-lfs.md
@@ -1,20 +1,20 @@
## Why Using <i class="fab fa-git"></i> LFS?
## Why Use <i class="fab fa-git"></i> LFS?

{% if build == "pages" %}
Data management is a critical aspect of scientific research and is often not versioned or tracked leading to data loss, duplication, or errors.
Effective data management is crucial in scientific research, yet it's often overlooked, leading to data loss, duplication, or errors.

Tracking large files with <i class="fab fa-git"></i> can be challenging.
As the size of your repository grows, it can slows down the <i class="fab fa-git"></i> operations, such as cloning, fetching, and pushing because <i class="fab fa-git"></i> stores the entire history of the repository locally.
To address this issue, <i class="fab fa-git"></i> Large File Storage (LFS) provides a solution for managing large files in your <i class="fab fa-git"></i> repositories.
As the size of your repository grows, operations like cloning, fetching, and pushing can slow down because <i class="fab fa-git"></i> stores the entire repository's history locally.
To address this, <i class="fab fa-git"></i> Large File Storage (LFS) provides a solution for managing large files in your <i class="fab fa-git"></i> repositories.

Git Large File Storage (LFS) is a <i class="fab fa-git"></i> extension that replaces large files in your repository with text pointers while storing the file contents on a remote server.
This approach allows you to work with large files in your repository without slowing down the <i class="fab fa-git"></i> operations.
<i class="fab fa-git"></i> LFS is an extension that replaces large files in your repository with lightweight text pointers, while the actual file contents are stored on a remote server.
This allows you to work with large files without affecting <i class="fab fa-git"></i> performance.

{% else %}

- **Data Management**: Proper data management is crucial in scientific research to prevent data loss, duplication, and errors.
- **Data Management**: Proper data management is essential to avoid data loss, duplication, and errors in scientific research.
- **Efficiency**: Traditional <i class="fab fa-git"></i> struggles with large files, leading to slow performance and bloated repositories.
- **Collaboration**: Ensures team members can work with large files without conflicts or performance issues.
- **History Tracking**: Maintains a history of changes, making it easier to revert to previous versions if needed.
- **Collaboration**: Enables team members to handle large files without conflicts or performance issues.
- **History Tracking**: Maintains a history of changes, making it easier to revert to previous versions when needed.

{% endif %}
49 changes: 29 additions & 20 deletions source/content/project_mgmt_tools/key_features_and_benefits.md
@@ -1,50 +1,59 @@
## Key Features and Benefits
:::{card}
**Version Control with <i class="fab fa-git"></i>'s Feature Branch Workflow** {% if page %}
GitHub<i class="fab fa-github"></i> and GitHub<i class="fab fa-gitlab"></i> allow teams to track changes, revert to previous versions, and collaborate on code seamlessly.
### Version Control with <i class="fab fa-git"></i>'s Feature Branch Workflow
{% if page %}
GitHub<i class="fab fa-github"></i> and GitLab<i class="fab fa-gitlab"></i> allow teams to track changes, revert to previous versions, and collaborate seamlessly on code.
This ensures that all modifications are documented, which is crucial for **transparency and accountability**.
{% else %}
Track changes, revert to previous versions, etc. crucial for **transparency and accountability**.
Tracking changes, reverting to previous versions, and collaborating seamlessly are crucial for **transparency and accountability**.
{% endif %}
:::

:::{card}
**Issue{octicon}`issue-opened;0.8em` Tracking** {% if page %}
Issues help manage tasks, bugs, and feature requests.
By documenting issues and their resolutions, teams can maintain a clear record of project progress and decision-making processes.

### Issue Tracking
{% if page %}
Issues help teams manage tasks, bugs, and feature requests efficiently. By documenting issues and their resolutions, teams can keep a clear record of project progress and decision-making processes.
{% else %}
Maintain a clear record of project progress and decision-making processes.
Efficiently manage tasks, bugs, and feature requests, while maintaining a clear record of project progress and decision-making.
{% endif %}
:::

:::{card}
**Code Review** {% if page %}
Pull (<i class="fab fa-github"></i>) and Merge Requests (<i class="fab fa-gitlab"></i>) facilitate peer review of code changes before they are integrated into the main codebase. Code reviews help catch errors, enforce coding standards, and ensure that contributions meet the project’s quality requirements.

### Code Review
{% if page %}
Pull (<i class="fab fa-github"></i>) and Merge Requests (<i class="fab fa-gitlab"></i>) facilitate peer review of code changes before they are integrated into the main codebase. Code reviews help catch errors, enforce coding standards, and ensure that contributions meet the project’s quality standards.
{% else %}
Help catch errors, enforce coding standards, and ensure contributions meet quality requirements.
<i class="fab fa-git"></i> tools help catch errors, enforce coding standards, and ensure contributions meet quality requirements.
{% endif %}
:::

:::{card}
**Continuous Integration/Continuous Deployment (CI/CD)** {% if page %}
GitHub<i class="fab fa-github"></i> Actions and GitHub<i class="fab fa-gitlab"></i> CI/CD provide automated testing and deployment pipelines to ensure that code changes are tested and validated before being merged. This reduces the risk of introducing errors and ensures that the codebase remains stable and reliable.

### Continuous Integration/Continuous Deployment (CI/CD)
{% if page %}
GitHub<i class="fab fa-github"></i> Actions and GitLab<i class="fab fa-gitlab"></i> CI/CD provide automated testing and deployment pipelines. These tools ensure that code changes are tested and validated before being merged, reducing the risk of introducing errors and ensuring that the codebase remains stable and reliable.
{% else %}
Reduce risk of errors and ensure codebase stability.
Automated testing and deployment pipelines reduce the risk of errors and ensure codebase stability.
{% endif %}
:::

:::{card}
**Documentation** {% if page %}
Wikis and README Files on both platforms support extensive documentation. Proper documentation is essential for reproducibility, allowing others to understand and replicate the work.
### Documentation
{% if page %}
Wikis and README files on both platforms support comprehensive documentation. Proper documentation is essential for reproducibility, allowing others to understand and replicate the work.
{% else %}
Essential for reproducibility, allowing others to understand and replicate the work.
Comprehensive documentation is essential for reproducibility, allowing others to understand and replicate the work.
{% endif %}
:::

:::{card}
**Collaboration and Communication** {% if page %}
Comments and Discussions on GitHub<i class="fab fa-github"></i> and GitHub<i class="fab fa-gitlab"></i> provide features for commenting on code, issues, and merge requests. This facilitates communication among team members, ensuring that everyone is on the same page and that decisions are well-documented.

### Collaboration and Communication
{% if page %}
Comments and Discussions on GitHub<i class="fab fa-github"></i> and GitLab<i class="fab fa-gitlab"></i> facilitate communication by allowing users to comment on code, issues, and merge requests. This ensures that team members stay aligned and that key decisions are well-documented.
{% else %}
Facilitate communication among team members, ensuring everyone is on the same page and decisions are well-documented.
Facilitate communication by enabling comments on code, issues, and merge requests, keeping team members aligned and ensuring key decisions are well documented.
{% endif %}
:::
:::