
Commit 3489704

Describe how to build pipelines package with pip
1 parent 2a9279a commit 3489704

File tree

2 files changed (+242, -0)

index.rst (+1)

@@ -326,6 +326,7 @@ ReStructuredText
     stack/eups-tutorial
     stack/lsstsw
     stack/adding-a-new-package
+    stack/building-with-pip
     stack/moving-to-github-actions
     stack/license-and-copyright
     stack/packaging-third-party-eups-dependencies

stack/building-with-pip.rst (+241)

@@ -0,0 +1,241 @@

###################################
Making Your Package pip Installable
###################################

By default a science pipelines package will be configured to work with SCons and EUPS but not with ``pip``.
If the package does not include C++ code and does not depend on any of the C++ packages, such as ``afw`` or ``daf_base``, then it is possible to make the package installable with ``pip`` as well as with SCons.

.. note::

   C++ packages can be built with ``pip``, but when packages are built this way they can't easily be used as dependencies for C++ code from other packages.
   The `lsst-sphgeom`_ PyPI package is a C++ package but can not be used to build ``afw`` because the include files and shared libraries are not part of the installation.
   Only the Python interface is available.

Configuring the Package
=======================

All the configuration for a modern Python package lives in a file called ``pyproject.toml``.
There are a number of sections that need to be created for ``pip`` to work properly.

The Build Requirements
----------------------

.. code-block:: toml

   [build-system]
   requires = ["setuptools", "lsst-versions >= 1.3.0"]
   build-backend = "setuptools.build_meta"

This section tells ``pip`` what to install before the build process can even begin.
Usually you will find that a package uses ``setuptools_scm`` to determine the version number.
This doesn't work for science pipelines packages since tags are applied by the pipelines build procedure and can not be set by the individual package owner.
This means that semantic versions can't be used, but it also means that if we want packages to be published on a cadence shorter than every six months we can not rely solely on the formal release tags appearing.

The `lsst-versions`_ package works around this restriction by determining version numbers based on the most recent formal version and the current weekly tag.
For example, in ``daf_butler`` the version that was uploaded to `PyPI`_ for tag ``w.2023.42`` was ``26.2023.4200`` (where v26 was the most recent formal release tag at the time).
The trailing ``00`` is the number of commits since the most recent weekly tag; it allows distinct version numbers to be generated on ticket branches whilst developing the package between weekly tags.
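
To illustrate the scheme (the five commits on a ticket branch are a hypothetical example):

.. code-block:: text

   v26.0.0               # most recent formal release tag
   w.2023.42             # weekly tag    -> version 26.2023.4200
   w.2023.42 + 5 commits # ticket branch -> version 26.2023.4205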

Project Metadata
----------------

The ``project`` section is used to specify the core metadata for the package.
For example, this is the content for the ``lsst-resources`` package:

.. code-block:: toml

   [project]
   name = "lsst-resources"
   description = "An abstraction layer for reading and writing from URI file resources."
   license = {text = "BSD 3-Clause License"}
   readme = "README.md"
   authors = [
       {name="Rubin Observatory Data Management", email="[email protected]"},
   ]
   classifiers = [
       "Intended Audience :: Developers",
       "License :: OSI Approved :: BSD License",
       "Operating System :: OS Independent",
       "Programming Language :: Python :: 3",
       "Programming Language :: Python :: 3.10",
       "Programming Language :: Python :: 3.11",
   ]
   keywords = ["lsst"]
   dependencies = [
       "lsst-utils",
       "importlib_resources",
   ]
   dynamic = ["version"]
   requires-python = ">=3.10.0"

The Rubin DM convention is that science pipelines packages that are to be distributed on `PyPI`_ should include the ``lsst-`` prefix in the name.
This differs from the EUPS naming convention, where the ``lsst`` is implicit.
For example, the ``daf_butler`` EUPS package has a Python distribution name of ``lsst-daf-butler``.
Middleware packages use a dual license for distribution, with the `PyPI`_ package declaring the BSD 3-clause license.
Most science pipelines packages use the GPLv3 license, and for those packages the ``pyproject.toml`` should use:

.. code-block:: toml

   license = {text = "GPLv3+ License"}
   classifiers = [
       "License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)",
   ]

Every Rubin DM science pipelines package should be owned by the special DM account, and that account should always be included in the ``authors`` section.
Additional authors can be included if required.

The ``dependencies`` section can only refer to packages that are available on `PyPI`_ since this is the section that will be read during ``pip install``.
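
For example, once the package has been published, a plain ``pip install`` resolves those dependencies from `PyPI`_ automatically:

.. code-block:: bash

   $ pip install lsst-resources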

Setuptools Configuration
------------------------

For ``setuptools`` builds additional configuration is needed so that the Python files and data files can be located.
For example, in ``daf_butler`` there is this configuration:

.. code-block:: toml

   [tool.setuptools.packages.find]
   where = ["python"]

   [tool.setuptools]
   zip-safe = true
   license-files = ["COPYRIGHT", "LICENSE", "bsd_license.txt", "gpl-v3.0.txt"]

   [tool.setuptools.package-data]
   "lsst.daf.butler" = ["py.typed", "configs/*.yaml", "configs/*/*.yaml"]

   [tool.setuptools.dynamic]
   version = { attr = "lsst_versions.get_lsst_version" }

This tells ``setuptools`` that the Python files are in a ``python/`` directory and specifies which additional non-Python files should be included in the distribution.

The ``license-files`` section should reflect the specific needs of your package.

When making a `PyPI`_ distribution, the package should work without relying on the EUPS ``$PACKAGE_DIR`` variable being set.
This means that any supplementary data, such as files that would go in a ``config/`` or ``policy/`` directory, should instead be included inside the ``python/`` directory and be accessed using the standard package resources APIs.
These files must then be listed explicitly in the ``package-data`` section of the configuration file.

.. warning::

   Currently ``pex_config`` does not understand how to read a config from a package using package resources.
   If configs are to be read they can not be read using the usual ``lsst.utils.getPackageDir`` API and must instead use `importlib.resources` APIs directly.
   We are planning to make this simpler by adding native support into ``pex_config``.
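
As a minimal sketch, package data can be read with the standard library like this (the package and file names here are purely illustrative):

.. code-block:: python

   from importlib.resources import files

   # Read a data file that ships inside the python/ tree and is listed
   # in the package-data section of pyproject.toml.
   config_text = files("lsst.daf.butler").joinpath("configs/datastore.yaml").read_text()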

Using GitHub Actions
====================

If a package is pip-installable it is likely that you will want to build the package in a GitHub action and run the associated tests.
If your package depends on other science pipelines packages you will want to install those directly from GitHub from the ``main`` branch, since there is no guarantee that `PyPI`_ will have the right version.
The easiest way to do this is to write a ``requirements.txt`` file listing the direct dependencies that should be installed by the build script.
This file is a simple text file listing packages and versions.

For example, the ``requirements.txt`` in the ``daf_relation`` package looks like:

.. code-block::

   git+https://github.com/lsst/utils@main#egg=lsst-utils
   sqlalchemy >= 1.4

The first line tells ``pip`` to install the dependency directly from GitHub.
The second line is a standard `PyPI`_ dependency.
These can be installed by running:

.. code-block:: bash

   $ pip install -r requirements.txt

and then the package can be installed with:

.. code-block:: bash

   $ pip install --no-deps .

This skips the dependency check and installs the package directly.
When developing multiple packages at the same time it is possible to change the ``requirements.txt`` file to point at a specific ticket branch rather than ``main``.
There are checkers available that can block merging if such a change has been made; a sketch of one is shown below.
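
A minimal sketch of such a check as a GitHub Actions step (illustrative only, not the actual checker used by Rubin DM):

.. code-block:: yaml

   - name: Check requirements.txt references main
     run: |
       # Fail if any GitHub dependency pins a branch other than main.
       if grep -E 'git\+.*@' requirements.txt | grep -v '@main'; then
         echo "requirements.txt must point at main before merging"
         exit 1
       fi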

If you want the version number of the build to be determined correctly, the code must be checked out in the GitHub action with the full history included:

.. code-block:: yaml

   steps:
     - uses: actions/checkout@v3
       with:
         # Need to clone everything for the git tags.
         fetch-depth: 0

Once a package is pip-installable it can be tested in the GitHub action.
If ``pytest`` is configured with code coverage enabled the results can be uploaded to CodeCov and reported on the pull request.
This would look something like:

.. code-block:: yaml

   - name: Build and install
     run: |
       python -m pip install --no-deps -v -e .

   - name: Run tests
     run: |
       pytest -r a -v -n 3 --open-files --cov=lsst.resources \
           --cov=tests --cov-report=xml --cov-report=term --cov-branch

   - name: Upload coverage to codecov
     uses: codecov/codecov-action@v2
     with:
       file: ./coverage.xml

Distributing the Package on PyPI
================================

Once the package supports ``pip install`` it is a small configuration change to allow it to be distributed on `PyPI`_.
One caveat is that all the required dependencies listed in the ``pyproject.toml`` file must exist on `PyPI`_.

The recommended process is for a GitHub action to trigger when the package is tagged.
This action will then build the package and trigger the upload to `PyPI`_.
All science pipelines packages on `PyPI`_ must be owned by the Rubin DM `PyPI`_ account attached to ``[email protected]``.

The `PyPI`_ upload can be configured in the same GitHub action that builds the package and tests it.
Usually it will block on the successful completion of that phase and then only trigger if a tag is being added.

A full example can be seen below:

.. code-block:: yaml

   pypi:
     runs-on: ubuntu-latest
     needs: [build_and_test]
     if: startsWith(github.ref, 'refs/tags/')
     permissions:
       id-token: write

     steps:
       - uses: actions/checkout@v3
         with:
           # Need to clone everything to embed the version.
           fetch-depth: 0

       - name: Set up Python
         uses: actions/setup-python@v4
         with:
           python-version: "3.11"

       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
           pip install --upgrade setuptools wheel build

       - name: Build and create distribution
         run: |
           python -m build --skip-dependency-check

       - name: Upload
         uses: pypa/gh-action-pypi-publish@release/v1

For the upload to work `PyPI`_ must be preconfigured to expect uploads from this specific GitHub action using a `trusted publisher`_ mechanism.

.. _PyPI: https://pypi.org
.. _lsst-sphgeom: https://pypi.org/project/lsst-sphgeom/
.. _lsst-versions: https://pypi.org/project/lsst-versions/
.. _trusted publisher: https://docs.pypi.org/trusted-publishers/
