Here you'll find a contributing guide to get started with development.
For local development, it is required to have Python 3.9 (or a later version) installed.
We use Poetry for project management. Install it and set up your IDE accordingly.
To install this package and its development dependencies, run:
make install-dev
To execute all code checking tools together, run:
make check-code
We utilize ruff for linting, which analyzes code for potential issues and enforces consistent style. Refer to pyproject.toml
for configuration details.
To run linting:
make lint
Our automated code formatting also leverages ruff, ensuring uniform style and addressing fixable linting issues. Configuration specifics are outlined in pyproject.toml
.
To run formatting:
make format
Type checking is handled by mypy, verifying code against type annotations. Configuration settings can be found in pyproject.toml
.
To run type checking:
make type-check
We employ pytest as our testing framework, equipped with various plugins. Check pyproject.toml for configuration details and installed plugins.
We use pytest as a testing framework with many plugins. Check pyproject.toml
for configuration details and installed plugins.
To run unit tests:
make unit-tests
To run unit tests with HTML coverage report:
make unit-tests-cov
We adhere to the Google docstring format for documenting our codebase. Every user-facing class or method is documented. Documentation standards are enforced using Ruff.
Our API documentation is generated from these docstrings using pydoc-markdown with additional post-processing. Markdown files in the docs/
folder complement the autogenerated content. Final documentation is rendered using Docusaurus and published to GitHub Pages.
To run the documentation locally, you need to have Node.js version 20 or higher installed. Once you have the correct version of Node.js, follow these steps:
Navigate to the website/
directory:
cd website/
Enable Corepack, which installs Yarn automatically:
corepack enable
Build the API reference:
./build_api_reference.sh
Install the necessary dependencies:
yarn
Start the project in development mode with Hot Module Replacement (HMR):
yarn start
Or using make
:
make run-doc
Publishing new versions to PyPI is automated through GitHub Actions.
- Alpha releases: Alpha releases can be triggered manually via GitHub Actions, typically from a pull request or a dedicated branch. This allows for testing new features or changes before merging into the master branch. Alpha releases use the version number from
pyproject.toml
with an alpha suffix. - Beta releases: On each commit to the master branch, a new beta release is automatically published. The version number is sourced from
pyproject.toml
, with the beta version suffix incremented by 1 from the last beta release on PyPI. - Stable releases: A stable version is published when a new release is created using GitHub Releases. The version number is taken from
pyproject.toml
, and the built package assets are automatically uploaded to the GitHub release.
Important notes:
- Ensure the version number in
pyproject.toml
is updated before creating a new release. If a stable version with the same version number already exists on PyPI, the publish process will fail. - The release process also fails if the version is not documented in
CHANGELOG.md
. Make sure to describe the changes in the new version there. - After a stable release, ensure to increment the version number in both
pyproject.toml
andcrawlee/__init__.py
.
Before merging a pull request or committing directly to master for automatic beta release:
- Ensure
pyproject.toml
version reflects the latest non-published version. - Describe changes in
CHANGELOG.md
under the latest non-published version section.
Before creating a GitHub Release for production:
- Confirm successful deployment of the latest beta release with the latest commit.
- Ensure changes are documented in
CHANGELOG.md
since the last production release. - When drafting a GitHub release:
- Create a new tag like
v1.2.3
targeting the master branch. - Use
1.2.3
as the release title. - Copy changes from
CHANGELOG.md
into the release description. - Check "Set as the latest release" option for visibility.
- Create a new tag like
To release a new version manually, follow these steps. Note that manual releases should only be performed if you have a good reason, use the automated release process otherwise.
- Update the version number:
- Modify the
version
field undertool.poetry
inpyproject.toml
. - Update the
__version__
field incrawlee/__init__.py
.
[tool.poetry]
name = "crawlee"
version = "x.z.y"
- Generate the distribution archives for the package:
poetry build
- Set up the PyPI API token for authentication:
poetry config pypi-token.pypi YOUR_API_TOKEN
- Upload the package to PyPI:
poetry publish