Merge pull request #34 from NHSDigital/release/xz_enhance_add_virtual…
…_environments_guidance

Added new improved guides on virtual environments
xiyaozhuang authored Dec 6, 2022
2 parents 828a6a1 + 975cf3c commit 20b9dab
Showing 10 changed files with 299 additions and 8 deletions.
3 changes: 2 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -157,10 +157,11 @@ cython_debug/
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
.idea/

.DS_Store

# VSCode

.vscode/
/.idea/.gitignore
Binary file added docs/images/python_environment.png
6 changes: 3 additions & 3 deletions docs/implementing_RAP/technical-workflow.md
@@ -162,15 +162,15 @@ Inspiration has been drawn from:
[4]: ./notebooks_versus_ide_development.md
[5]: ./tools.md#conda-environment
[6]: ../training_resources/git/intro-to-git.md#the-gitignore-file
[7]: ../training_resources/python/virtual-environments.md#how-to-create-a-new-virtual-environment-using-conda
[8]: ../training_resources/python/virtual-environments.md#how-to-activate-an-environment
[7]: ../training_resources/python/virtual-environments/conda.md#how-to-create-a-new-virtual-environment-using-conda
[8]: ../training_resources/python/virtual-environments/conda.md#how-to-activate-an-environment
[9]: ./tools.md#the-terminal
[10]: ../training_resources/git/intro-to-git.md#common-basic-commands
[11]: ./tools.md#code-editing
[12]: ./tools.md#interactive-cells-in-vs-code
[13]: ./tools.md#interactive-python-notebooks
[14]: ../training_resources/python/python-functions.md
[15]: ./tools.md#linting-in-vs-code
[16]: ../training_resources/python/virtual-environments.md#conda-environment
[16]: ../training_resources/python/virtual-environments/conda.md#conda-environment
[17]: ../training_resources/git/using-git-collaboratively.md#resolving-merge-conflicts
[18]: ../training_resources/git/using-git-collaboratively.md
2 changes: 1 addition & 1 deletion docs/implementing_RAP/tools.md
@@ -255,5 +255,5 @@ Once you've converted the file, you can run the code as you would with any other

[1]: ../training_resources/git/intro-to-git.md#what-is-a-terminal
[2]: ../training_resources/git/intro-to-git.md
[3]: ../training_resources/python/virtual-environments.md
[3]: ../training_resources/python/virtual-environments/conda.md
[4]: ./notebooks_versus_ide_development.md
2 changes: 1 addition & 1 deletion docs/introduction_to_RAP/levels_of_RAP.md
@@ -68,4 +68,4 @@ _Meeting all of the above requirements, plus:_
[9]: ../implementing_RAP/code-review.md
[10]: ../training_resources/python/python-functions.md#documentation
[11]: ../training_resources/python/logging-and-error-handling.md
[12]: ../training_resources/python/virtual-environments.md
[12]: ../training_resources/python/virtual-environments/conda.md
2 changes: 1 addition & 1 deletion docs/training_resources/git/intro-to-git.md
@@ -102,7 +102,7 @@ Example of Windows vs Linux commands differences:
- display current directory - `pwd` (mac, linux) - `chpwd` or `%cd%` (Windows)
- List of other [command differences](https://www.geeksforgeeks.org/linux-vs-windows-commands/)

For more info on conda virtual environments see [here](../python/virtual-environments.md).
For more info on conda virtual environments see [here](../python/virtual-environments/conda.md).

## Setup for Git Basics exercise

161 changes: 161 additions & 0 deletions docs/training_resources/python/virtual-environments/conda.md
@@ -0,0 +1,161 @@
# Conda environment

Conda is the environment manager (and package manager) bundled with Anaconda, a commonly used distribution of Python, R and other applications. See [these instructions][install-anaconda] on how to install and set up Anaconda. Python packages (and other applications) in the default conda channels are curated by the Anaconda team. There is also "conda-forge", a "channel" of packages that is managed by "the community", i.e. whoever made the packages.

You can interact with conda via the Anaconda Prompt for Windows, or in a terminal window for Mac/Linux. By inputting commands in the command prompt you can then create and manage your virtual environments using the conda package manager.

See the [Anaconda user guides][conda-getting-started] for more information on getting started with conda.

## How to create a new virtual environment using conda

To create a new conda virtual environment for your project, open the Anaconda Prompt (Windows) or a terminal window (Mac/Linux) and enter:

```conda
conda create --name myenvironment python=3.9
```

- The `--name` option specifies a name for the environment: in this example the environment will be named "myenvironment", but you can replace this with something better suited to your project.
- `python=3.9` specifies the Python version you wish the virtual environment to run, in this case version 3.9.

To check the packages that are installed in the [active environment](#how-to-activate-an-environment), enter:

```conda
conda list
```

To install a package into the active environment:

```conda
conda install pandas
```

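To create an environment, specify the Python version and install multiple packages in **one line**:

```conda
conda create --name mynewenvironment python=3.8 pandas=3.1.0 flake8=3.9.2 numpy
```

Notice how the versions for pandas and flake8 are pinned with `=` but no version is given for numpy. Because numpy is unpinned, conda will install the most recent version of numpy that is compatible with the other packages. The version numbers here are illustrative: conda will report an error if a requested version is not available for the chosen Python version.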

- **TIP:** It is **recommended** to install all packages in one go (e.g. numpy, pandas, pytest etc). Installing packages one at a time can cause dependency conflicts (see [Dependency Hell][dependency-hell]).

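If you're unsure whether a specific package is installed in the current environment, list it by name:

```conda
conda list flake8
```

To check which versions of a package are available to install, use `conda search flake8` instead.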

For more information on managing environments and other commands such as updating your environment's packages, check out [Managing environments with Conda][managing-conda-envs].

## How to activate an environment

The 'active' environment is the one that conda will reference when you enter any commands, e.g. new packages will be installed into the active environment, the installed package list will be based on the active environment, etc.

When you start a new Anaconda Prompt (Windows) or open a new terminal (Mac/Linux), the active environment is set to `base`.
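
From inside Python, one way to confirm which conda environment is active is to read the `CONDA_DEFAULT_ENV` environment variable, which conda sets on activation. The following is a minimal sketch, assuming the interpreter was started from an activated conda shell; `active_conda_env` is a hypothetical helper name, not part of conda.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if none is set."""
    environ = os.environ if environ is None else environ
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return environ.get("CONDA_DEFAULT_ENV") or None


if __name__ == "__main__":
    print("Active conda environment:", active_conda_env())
    # sys.prefix is the root folder of the environment running this interpreter.
    print("Interpreter location:", sys.prefix)
```

If the script is run outside any conda shell, the helper returns `None` rather than raising an error.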

To activate an environment, enter:

```conda
conda activate mynewenvironment
```

As above, "mynewenvironment" is the name of the environment and can be replaced with the name of the environment you wish to activate.

## How to export and share environments

To ensure reproducibility, it is important that we can export virtual environments and share them alongside the code. In this way, someone else will be able to run your code with exactly the same environment (packages, dependencies etc) as you did. This helps address the classic 'it works on my machine' problem.

There are a few options for exporting environments and recreating environments from exported files.

### Pip requirements.txt

If you followed the [Project structure and package organisation][1] guide, you will have created a `requirements.txt` file in your repository, which specifies all the python packages you wish to install.

- The benefit of a requirements.txt file created with Pip is that anyone with a Python installation should be able to install it (i.e. someone else wouldn't need to have conda installed)
- The drawback is that the requirements.txt file offers an incomplete specification for the environment (for example, it does not specify the python version), so projects using requirements.txt files must be careful to specify any additional dependencies in another way (e.g. via a README).
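
Because a requirements.txt file is only as reproducible as the versions it pins, it can help to check which entries leave the version open. The sketch below is one simple way to do that, assuming the common `package==version` pinning style; `unpinned_requirements` is a hypothetical helper, the example entries are made up, and the parsing is deliberately simplified (it does not handle URLs or environment markers).

```python
def unpinned_requirements(lines):
    """Return requirement lines that do not pin an exact version with '=='."""
    unpinned = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options such as -r
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


# Made-up example entries, for illustration only.
example = ["pandas==1.1.0", "numpy", "# a comment", "flake8>=3.9"]
print(unpinned_requirements(example))  # ['numpy', 'flake8>=3.9']
```

To check a real file, pass its lines instead, for example `unpinned_requirements(open("requirements.txt").read().splitlines())`.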

To install the packages listed in a requirements.txt file into a conda environment, first create the environment using the commands above, then activate it:

```conda
conda activate mynewenvironment
```

And then enter:

```conda
conda install --file requirements.txt
```

### Conda environment.yml

Conda offers a way to export and share environments via a yaml file.

- The benefit of this approach is that the output file gives a more complete picture of the dependencies for a project than a requirements.txt file from Pip.
- The drawback is that someone would need to have conda installed to use your project: if you work in an organisation or team that consistently uses conda then this is not as relevant, but it may matter more if you want to distribute your project to others who may not use conda.

To export the active environment:

```conda
conda env export > environment.yml
```

The resulting file can be used to completely rebuild a conda environment:

```conda
conda env create --file environment.yml
```

## How to remove an environment

Make sure your environment is not active by typing:

```conda
conda deactivate
```

This will take you back to the base conda environment. Then to delete your specified environment:

```conda
conda env remove --name mynewenvironment
```

## Conda help command

Type the following command to list the available `conda env` subcommands and their options:

```conda
conda env --help
```

## See Also

### Using Spyder with conda environments

Spyder is a Python IDE that is bundled with Anaconda during installation. It can be tricky to set up Spyder to work with multiple conda environments: see [this guide][spyder-conda-envs] for instructions on how to do this.

### Using Docker with Data Refinery

[Docker][docker-getting-started] is a container manager that offers many solutions and applications for managing your environments. Currently at NHS Digital we cannot use Docker due to compatibility issues. This may change in the future so we will update the resources on this page accordingly.

## External links

- [Python virtual environments and packages][python-venvs]
- [Installing Anaconda][install-anaconda]
- [Getting started with Conda][conda-getting-started]
- [Managing environments with Conda][managing-conda-envs]
- [Conda commands cheatsheet][conda-cheatsheet]
- [List of other package managers][package-managers]
- [Using Spyder with conda environments][spyder-conda-envs]

*NHS Digital is not affiliated with any of these websites or companies.*

[python-venvs]: https://docs.python.org/3/tutorial/venv.html
[install-anaconda]: https://docs.anaconda.com/anaconda/install/index.html
[conda-getting-started]: https://conda.io/projects/conda/en/latest/user-guide/getting-started.html
[managing-conda-envs]: https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands
[conda-cheatsheet]: https://conda.io/projects/conda/en/latest/user-guide/cheatsheet.html
[package-managers]: https://en.wikipedia.org/wiki/List_of_software_package_management_systems
[docker-getting-started]: https://docs.docker.com/get-started/overview/
[dependency-hell]: https://en.wikipedia.org/wiki/Dependency_hell
[spyder-conda-envs]: https://github.com/spyder-ide/spyder/wiki/Working-with-packages-and-environments-in-Spyder
[1]: ../project-structure-and-packaging.md
79 changes: 79 additions & 0 deletions docs/training_resources/python/virtual-environments/venv.md
@@ -0,0 +1,79 @@
# Venv environment

The venv module is part of python's standard library. It supports creating lightweight virtual environments, each with their own independent set of packages installed in their site directories. A virtual environment is created on top of an existing python installation, known as the virtual environment’s “base” python, and may optionally be isolated from the packages in the base environment, so only those explicitly installed in the virtual environment are available.

When used from within a virtual environment, common installation tools such as pip will install python packages into a virtual environment without needing to be told to do so explicitly.

## How to create a virtual environment using venv
To create a new venv environment for your project, open a terminal and enter:

=== "Windows PowerShell"

```
py -<python-version> -m venv <venv-directory>
```

=== "Linux/macOS"

```
python<python-version> -m venv <venv-directory>
```

!!! note

- The optional `<python-version>` part specifies the python version you wish the virtual environment to run. If omitted on Windows, the `py` launcher defaults to the latest python version installed on your system; on Linux/macOS, the version is whichever one the plain `python` command resolves to.

- `<venv-directory>` is required and specifies the directory in which you want to create your virtual environment.

- In VS Code you can create a virtual environment by using the command `Python: Create Environment`, available in the command palette (`CTRL + SHIFT + P`). This can also install the packages listed in a `requirements.txt` file.
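
The same environment can also be created from a Python script using the standard library `venv` module. This is a minimal sketch rather than a recommended workflow: the directory name `example-venv` is a placeholder, the environment is built in a temporary folder so nothing is left behind, and `with_pip=False` is chosen only to keep the example fast.

```python
import tempfile
import venv
from pathlib import Path

# Create the environment in a temporary folder so the sketch cleans up after itself.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "example-venv"  # placeholder directory name
    # with_pip=False keeps the sketch fast; the command-line form installs pip by default.
    venv.create(env_dir, with_pip=False)
    # Every venv environment contains a pyvenv.cfg file recording its base python.
    created = (env_dir / "pyvenv.cfg").exists()
    print("pyvenv.cfg created:", created)
```

For a real project you would pass a permanent directory and set `with_pip=True` so that packages can be installed into the environment.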

## How to activate a venv environment
To activate a venv environment open a terminal and enter:

=== "Windows PowerShell"

```
<venv-directory>/Scripts/activate
```

!!! note

- On Windows you may be unable to run the activation script due to [execution policy](https://learn.microsoft.com/en-gb/powershell/module/microsoft.powershell.core/about/about_execution_policies?view=powershell-7.3)

- If you would like to use the Windows Command Prompt instead, file paths must be written with backslashes rather than forward slashes. PowerShell is recommended as it accepts either.

=== "Linux/macOS"

```
source <venv-directory>/bin/activate
```
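
To confirm from inside python that a venv environment is active, you can compare `sys.prefix` with `sys.base_prefix`. The sketch below assumes an environment created with venv; `in_virtual_environment` is a hypothetical helper name.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv environment."""
    # Inside a venv, sys.prefix points at the environment, sys.base_prefix at the base install.
    return sys.prefix != sys.base_prefix


print("Inside a virtual environment:", in_virtual_environment())
```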

## How to deactivate a venv environment
To deactivate the active venv environment, enter the following in the terminal where it is active:

```
deactivate
```

## How to install packages into a venv environment
To install packages to a venv environment using a `requirements.txt` file, first activate the target environment. For example, using Windows PowerShell:

```
<venv-directory>/Scripts/activate
```

And then enter:

```
py -m pip install -r requirements.txt
```

## External links

- [Installing packages using pip and virtual environments][python-venvs]
- [Python Virtual Environments: A Primer][virtual-env-primer]

[python-venvs]: https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/
[virtual-env-primer]: https://realpython.com/python-virtual-environments-a-primer/

*NHS Digital is not affiliated with any of these websites or companies.*
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
# Virtual environments

## What are virtual environments?

A virtual environment is a tool that helps to keep dependencies required by different projects separate by creating isolated python environments for each project. This is one of the most important tools that most python developers use.

## Why use virtual environments?

Virtual environments are a way to make sure your code will run correctly for you and others. By always coding inside a virtual environment, you make it more likely that your work will be usable by others.

If someone tries to run your code using a different version of python, it might fail. Likewise, if your code depends on a package and your users have a different version of that package installed, it might also fail.
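
To see which versions your current environment actually provides, you can ask python directly. This is a small sketch using only the standard library (python 3.8 or later); `installed_version` is a hypothetical helper, and the package names in the loop are examples to replace with your own dependencies.

```python
import platform
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("Python:", platform.python_version())
for name in ["pandas", "numpy"]:  # example package names; replace with your own dependencies
    print(name, installed_version(name))
```

Running this in two different environments makes the differences between them visible at a glance.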

Worse than this, if you have multiple projects and one depends on 'my_example_library_v1' while another needs 'my_example_library_v2', then a single shared installation cannot satisfy both and at least one project will break. Sometimes your code might depend on an outdated version of a package because the latest update introduces bugs and issues. Or you might have multiple versions of your code that need different versions of packages to run.

![xkcd comic demonstrating a messy python environment](../../../images/python_environment.png)

Virtual environments address these situations by keeping all of the packages and versions for each project separate. I can create a virtual environment called 'project-1-env' that uses python 2.7 and 'my_example_library_v1'. I can create a second virtual environment called 'project-2-env' that uses python 3.8 and 'my_example_library_v2'. As you move from working on one project to another, you just need to switch to the environment associated with that work.

It is good practice to always code inside a virtual environment.

## How to create a virtual environment

There are many ways to create virtual environments in python. This guide will go through the essentials for [creating a virtual environment with conda][create-venv-with-conda] and for [creating a virtual environment with venv][create-venv-with-venv].

## What are conda and venv?

Conda is an open source package and environment management system. It was created for python programs, but it can package and distribute software for any language. The conda package and environment manager is included in all versions of Anaconda and Miniconda.

- Conda is associated with the Anaconda package repository, but it can also use `pip` to pull packages from PyPI if needed.

- The advantage of using conda is that you can export and share environments using an `environment.yml` file, which specifies the environment completely, including the python version.

The venv module is part of python’s standard library and supports creating virtual environments, each with their own independent set of python packages installed in their site directories.

- Venv is associated with the PyPI package repository.

- The advantage of using venv is that it is ready to use with a fresh install of python, with no prerequisites, whereas to use conda you must first install Anaconda or Miniconda.

!!! note

You may not have a choice about which environment manager you use. Your organisation may choose one over the other, and this might be dictated by their choice of systems, e.g. if they have chosen to make a "PyPI" mirror using "Bandersnatch", you will need to use venv.

[create-venv-with-conda]: ./conda.md
[create-venv-with-venv]: ./venv.md
8 changes: 7 additions & 1 deletion mkdocs.yml
@@ -46,7 +46,10 @@ nav:
- Handling file paths: training_resources/python/handling-file-paths.md
- Logging and error handling in Python: training_resources/python/logging-and-error-handling.md
- Project structure and packaging: training_resources/python/project-structure-and-packaging.md
- Virtual environments: training_resources/python/virtual-environments.md
- Virtual environments:
- Why use virtual environments?: training_resources/python/virtual-environments/why-use-virtual-environments.md
- Venv: training_resources/python/virtual-environments/venv.md
- Conda: training_resources/python/virtual-environments/conda.md
- Unit testing: training_resources/python/unit-testing.md
- Unit testing field definitions: training_resources/python/unit-testing-field-definitions.md
- Back testing: training_resources/python/backtesting.md
@@ -72,6 +75,7 @@ theme:
features:
- search.share
- content.code.annotate
- content.tabs.link
icon:
admonition:
<type>: material/alert
@@ -91,6 +95,8 @@ markdown_extensions:
- pymdownx.inlinehilite
- pymdownx.snippets
- pymdownx.superfences
- pymdownx.tabbed:
alternate_style: true
- admonition
- pymdownx.details
- pymdownx.critic
