Review and update page: "Storing Source Code" #165

Open
wants to merge 7 commits into
base: master
162 changes: 141 additions & 21 deletions source/standards/storing-source-code/index.html.md.erb
@@ -6,33 +6,153 @@ old_paths:

# <%= current_page.data.title %>

All our source code is open by default, and stored on well-known,
public code hosting services. At the DfE, we use GitHub.

We follow the principles set out in the service manual for managing the
code that we write:
## GOV.UK Service Manual

- [use version control](https://www.gov.uk/service-manual/technology/maintaining-version-control-in-coding)
- [make source code open](https://www.gov.uk/service-manual/technology/making-source-code-open-and-reusable)
We follow principles set out within the [GOV.UK Service Manual](https://www.gov.uk/service-manual)
for managing code we write.

You should keep secrets separate from source code, and keep them private.
### Service Manual Sections of relevance:

## GitHub
- [Service Manual > Technology > Use version control](https://www.gov.uk/service-manual/technology/maintaining-version-control-in-coding)
- [Service Manual > Technology > Make source code open](https://www.gov.uk/service-manual/technology/making-source-code-open-and-reusable)
- [Service Manual > Technology > Securing your information](https://www.gov.uk/service-manual/technology/securing-your-information)

New repositories for products and services live in the
[Department for Education Digital organisation](https://github.com/DfE-Digital)
on GitHub. New repositories must be created within the [Department for Education Digital](https://github.com/DFE-Digital) organisation, whether they contain service production code or prototypes. Work created outside of the DfE Digital organisation should be transferred into the DfE Digital organisation at the earliest opportunity. [Guide to transferring a repository](https://help.github.com/en/articles/transferring-a-repository).
### Summary / Highlights:

You can use your personal GitHub account (but you should [add your DfE
email address to your account](https://help.github.com/articles/adding-an-email-address-to-your-github-account/),
and [use it for notifications](https://help.github.com/articles/managing-notification-emails-for-organizations/)).
Ask your delivery manager to request that you are added to the GitHub organisation.
- Changes to source code _must_ be tracked
- Code we produce _should_ be made available via an internet source code repository
- Published code _should_ be under an Open Source initiative compatible licence
- Due care _must_ be given to security considerations, including:
  - Suitable protection of confidential information and secrets
  - Departmental/governmental rules related to the use of cloud/3rd-party tooling
  - Proper process and accountability/approvals for making code changes

Another GitHub organisation account heavily used by the DfE, but not the default for DfE-Digital, is ['SkillsFundingAgency'](https://github.com/SkillsFundingAgency/).
Additional detail and information is available via the links above.

Repositories should be [clearly named](/standards/naming-things/),
and have an [appropriate licence](/standards/licencing-software-or-code)
and enough documentation that someone new can get started with the
project.
## Types of source code

Private repositories are not a good way to protect secrets, and should only be used where access to the code might reveal draft policy decisions. Secrets should be managed at the platform level.
Source code covers more than just business and presentation code.

Examples of source code types and purposes:

- **Project source code**
  - Code used to meet a user need - i.e., what is normally considered when describing "source code"
- **Test code**
  - Code used to evaluate the correctness of the project code
  - Depending on the project, test code may involve provisioning infrastructure, deploying a build,
    and even running the project code
    _(e.g. a headless browser to test the presentation and accessibility of a web page)_
- **Infrastructure as code**
  - Code used to provision and configure the infrastructure a project runs upon
- **CI configuration**
  - Code used to inspect, validate, and potentially gate-keep changes being made to project code
  - May include GitHub Actions and Azure Pipelines
  - Typically triggered on a merge/PR event, but other examples include being triggered on
    creation of a particular tag (e.g., one in the format `vX.Y.Z`) or on a timer/cron-basis
- **Deployment code**
  - Code used to build, test, and deploy project source code into a running environment
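
As a rough illustration of the tag-triggered case mentioned above, the check a pipeline might apply to decide whether a tag is in the `vX.Y.Z` format could look like the sketch below. The function name and the exact pattern are assumptions made for this example; real CI tools (such as GitHub Actions or Azure Pipelines) have their own filter syntax.

```python
import re

# Hypothetical pattern: a literal "v" followed by three dot-separated numbers.
RELEASE_TAG = re.compile(r"v\d+\.\d+\.\d+")


def is_release_tag(tag: str) -> bool:
    """Return True only when the whole tag has the form vX.Y.Z (e.g. v1.4.0)."""
    return RELEASE_TAG.fullmatch(tag) is not None
```

Using `fullmatch` rather than `search` means tags such as `v1.2.3-rc1` are rejected, which may or may not be what a given project wants.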

How this source code is stored and structured will vary by project, based on needs and historical convention.
For example:

- A project may be composed of multiple services where each has its own repository
- A monolith may have all source code for all purposes stored within the same source code repository
- A mixture may apply where project source code and tests are within one repository,
while infrastructure code may be stored within a separate repository

## Source Code Versioning: Git

At the Department for Education (DfE) we use [Git](https://git-scm.com/) for source code versioning.

- Git is decentralised - this means all copies of the repository include the WHOLE history of the repository,
not just a snapshot
- Branches are "cheap" - creating a new branch (or tag) involves just a new pointer at a specific commit
(thus minimal compute and storage implications)
- Each commit's hash depends on its content and on the hashes of the commits before it - thus, a
surreptitious / malicious / accidental change to an earlier version of a file changes every later
commit hash, making it very visible to other users
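
The tamper-evidence point above can be sketched in a few lines. `git_blob_hash` follows the way Git (in SHA-1 repositories) derives an object ID for file content; `toy_commit_hash` and `chain` are deliberately simplified stand-ins invented for this example and do not reproduce Git's real commit format.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Object ID Git assigns to file content: SHA-1 over a 'blob <size>\\0' header plus the content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def toy_commit_hash(parent_hash: str, blob_hash: str) -> str:
    """Simplified commit ID (an assumption for illustration): mixes the parent ID with the content ID."""
    return hashlib.sha1(f"{parent_hash}:{blob_hash}".encode()).hexdigest()


def chain(versions: list[bytes]) -> list[str]:
    """Return the toy commit IDs for a sequence of file versions, oldest first."""
    hashes = []
    parent = ""
    for content in versions:
        parent = toy_commit_hash(parent, git_blob_hash(content))
        hashes.append(parent)
    return hashes
```

Changing the first version in a call to `chain` changes every ID in the result, which is the property that makes rewritten history noticeable to anyone holding an older copy.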

## Git Repository Hosting - GitHub and Azure DevOps (ADO)

While not required, most git users will nominate one copy of the git repository to be the authoritative copy.

- It is possible to self-host a git server for this purpose but, often, this will be a hosted solution such as
GitHub, Azure DevOps, GitLab, or any of the numerous other commercial services available.
- A "hub and spoke" model is easier to reason about and keep synchronised
- Integrations with other tools will work with less friction, where they have a single copy to work with
(e.g., automated test/deployment tools, issue/bug management)

### GitHub

For legacy reasons, some projects use (and may remain on) private Azure DevOps and/or private GitHub repositories,
though we are now required
to [make new source code open](https://apply-the-service-standard.education.gov.uk/service-standard/12-make-new-source-code-open.html).

Specifically, we use GitHub for new and migrated work.

### GitHub Organisations

Department for Education (DfE) source code repositories on GitHub should be stored under an appropriate
organisation, thereby giving appropriate oversight and protections to these source code repositories.

Specifically:

- The [Department for Education Digital](https://github.com/DfE-Digital)
  GitHub organisation is used for new and existing source code repositories
  - This is applicable to production and prototype code
  - Work created outside the DfE Digital organisation should be transferred into
    the DfE Digital organisation at the earliest opportunity.
    - [GitHub: Guide to transferring a repository](https://help.github.com/en/articles/transferring-a-repository)
- The [Skills Funding Agency](https://github.com/SkillsFundingAgency/)
  GitHub organisation is also used by the DfE
  - Not the default for DfE-Digital

If your account is added to a repository without the account being a member of the owning organisation,
it will be counted and labelled as
an ["outside collaborator"](https://docs.github.com/en/organizations/managing-user-access-to-your-organizations-repositories/adding-outside-collaborators-to-repositories-in-your-organization).

To join a GitHub organisation, follow the guidance and request forms available
via [Digital Tools Support](<%= data.site.digital_tools %>).
As there is a small cost implication for accounts to be added to a GitHub organisation,
this should normally be done via / with support from your Delivery Manager.

### Repository Requirements

Repositories should:

- be [clearly named](/standards/naming-things/)
- have an [appropriate licence](/standards/licencing-software-or-code)
- have enough documentation that someone new can get started with the project

## Data Protection Considerations - Git Repositories

### Personal Data

Storage of a git repository must be treated with due care and consideration.
This applies whether it is within a central hosted environment or stored
elsewhere, such as on a developer's computer.

Places where we may normally find personally-identifiable information:

- Changes to source code (commits) are annotated with authorship details.
Typically, this is a real name (or username) and an email address.
- Where a commit is cryptographically-signed, the GPG key used will also have
personally-identifying information associated with it (such as an email address).
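
To make the authorship point concrete, the sketch below pulls the name and email address out of the header lines of a raw commit object (the text printed by `git cat-file -p <commit>`). The function name and the sample identity in the usage note are invented for this example.

```python
import re

# Header lines in a raw commit look like:
#   author Some Name <some.name@example.com> 1700000000 +0000
IDENTITY = re.compile(
    r"^(?P<role>author|committer) (?P<name>.*) <(?P<email>[^>]*)> \d+ [+-]\d{4}$"
)


def identities(raw_commit: str) -> list[tuple[str, str, str]]:
    """Return (role, name, email) for each identity header in a raw commit."""
    found = []
    for line in raw_commit.splitlines():
        if line == "":
            break  # headers end at the first blank line; the commit message follows
        match = IDENTITY.match(line)
        if match:
            found.append((match["role"], match["name"], match["email"]))
    return found
```

For a commit authored and committed by a hypothetical `Example Developer <dev@example.com>`, this returns two entries, one for each role, showing that the same personal details are stored in every copy of the repository.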

Additionally, note that git is _explicitly_ a _decentralised_ source versioning and control system.

- It is, therefore, not possible to delete/change information within one copy of the
repository (e.g., GitHub) and force all other copies to be updated also
- It is, therefore, extremely important to prevent non-public content from ever
being added to the git repository in the first place because it cannot be removed
with 100% confidence (being able to do so is an edge case, not the norm)

### Secrets

You _must_ keep secrets separate from source code, and keep them private.

Private repositories are a poor way to protect secrets, and may only be used
where access to the code might reveal draft policy decisions.

Secrets should be managed at the platform level.
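
One common way to keep secrets out of source code is for the application to read them from environment variables that the hosting platform sets at run time. The sketch below assumes that approach; the helper name `get_secret` and the variable name used in the usage note are hypothetical, not a DfE convention.

```python
import os


def get_secret(name: str) -> str:
    """Read a secret from the environment, failing loudly if the platform has not provided it."""
    value = os.environ.get(name)
    if not value:
        # Never fall back to a default committed in the repository.
        raise RuntimeError(f"Secret {name!r} is not set; configure it on the hosting platform")
    return value
```

A call such as `get_secret("EXAMPLE_API_KEY")` then works unchanged across environments, while the value itself never appears in the Git history.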