Thank you for considering contributing to dlt! We appreciate your help in making dlt better. This document will guide you through the process of contributing to the project.
- Getting Started
- Submitting Changes
- Adding or updating core dependencies
- Linting
- Testing
- Local Development
- Publishing (Maintainers Only)
- Resources
- Proposing significant changes or enhancements: If you're thinking about making significant changes, make sure to submit an issue first. This ensures your efforts align with the project's direction and that you don't invest time in a feature that may not be merged.
- Fixing bugs:
  - Check existing issues: search open issues to see if the bug you've found is already reported.
  - If not reported, create a new issue. You're more than welcome to fix it and submit a pull request with your solution. Thank you!
  - If the bug is already reported, please leave a comment on that issue stating you're working on fixing it. This helps keep everyone updated and avoids duplicate efforts.
To get started, follow these steps:
- Fork the `dlt` repository and clone it to your local machine.
- Install `poetry` with `make install-poetry` (or follow the official instructions).
- Run `make dev` to install all dependencies including dev ones.
- Start working in the `poetry` shell by executing `poetry shell`.
When you're ready to contribute, follow these steps:
- Create an issue describing the feature, bug fix, or improvement you'd like to make.
- Create a new branch in your forked repository for your changes.
- Write your code and tests.
- Lint your code by running `make lint` and test common modules with `make test-common`.
- If you're working on destination code, contact us to get access to test destinations.
- Create a pull request targeting the devel branch of the main repository.
Note: in some special cases, you'll need to contact us to create a branch in this repository (not a fork). See below.
We use devel (which is our default GitHub branch) to prepare the next release of `dlt`. We accept all regular contributions there (including most bug fixes).
We use the master branch for hot fixes (including documentation) that need to be released outside the normal schedule.
On release day, the devel branch is merged into master. All releases of `dlt` happen only from master.
We want our git history to explain in a human-readable way what has been changed by which branch or PR. To this end, we use the following branch naming pattern (all lowercase with dashes, no underscores):
{category}/{ticket-id}-description-of-the-branch
# example:
feat/4922-add-avro-support
- feat - a new feature that is being implemented (ticket required)
- fix - a change that fixes a bug (ticket required)
- exp - an experiment where we are testing a new idea or want to demonstrate something to the team, might turn into a `feat` later (ticket encouraged)
- test - anything related to the tests (ticket encouraged)
- blogs - a new entry to our blog (ticket optional)
- docs - a change to our docs (ticket optional)
We encourage you to attach your branches to a ticket. If none exists, create one and explain what you are doing. For `feat` and `fix` branches, tickets are mandatory; for `exp` and `test` branches they are encouraged; and for `blogs` and `docs` branches they are optional.
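The naming rules above can be checked mechanically. The following is a minimal sketch, not project tooling: `check_branch_name` and `TICKET_POLICY` are hypothetical names, and the sketch assumes that ticket ids are purely numeric, as in the `feat/4922-add-avro-support` example.

```python
import re

# Categories and ticket policy as listed above. The policy labels are this
# sketch's own encoding, not something the project defines in code.
TICKET_POLICY = {
    "feat": "required",
    "fix": "required",
    "exp": "encouraged",
    "test": "encouraged",
    "blogs": "optional",
    "docs": "optional",
}

# {category}/{ticket-id}-description: lowercase letters, digits and dashes only.
# Assumption: the ticket id is a run of digits followed by a dash.
BRANCH_RE = re.compile(
    r"^(?P<category>[a-z]+)/(?:(?P<ticket>\d+)-)?"
    r"(?P<description>[a-z0-9]+(?:-[a-z0-9]+)*)$"
)


def check_branch_name(name):
    """Return (ok, message) for a proposed branch name."""
    match = BRANCH_RE.match(name)
    if match is None:
        return False, "expected {category}/{ticket-id}-description, lowercase with dashes"
    category = match.group("category")
    if category not in TICKET_POLICY:
        return False, "unknown category: " + category
    if TICKET_POLICY[category] == "required" and match.group("ticket") is None:
        return False, category + " branches need a ticket id"
    return True, "ok"


print(check_branch_name("feat/4922-add-avro-support"))
print(check_branch_name("feat/add-avro-support"))
```

In this sketch a missing ticket id is only rejected for `feat` and `fix`; for the other categories it is accepted silently, which mirrors the "encouraged" and "optional" wording above.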
We'll fix critical bugs and release `dlt` outside the regular schedule. Follow the regular procedure, but make your PR against the master branch. Please ping us on Slack if you do.
We enable our CI to run tests for contributions from forks. All the tests are run, but not all destinations are available, because forks do not have access to credentials. Currently only `duckdb` and `postgres` are available to forks.
If you submit a new destination or make changes to a destination that requires credentials (for example BigQuery, Snowflake, or buckets), contact us so we can add you as a contributor. Then make your PR directly to the `dlt` repo.
Our objective is to maintain stability and compatibility of dlt across all environments. By following these guidelines, we can make sure that dlt stays secure, reliable and compatible. Please consider the following points carefully when proposing updates to dependencies.
- Critical security or system integrity updates only: Major or minor version updates to dependencies should only be considered if there are critical security vulnerabilities or issues that impact the system's integrity. In such cases, updating is necessary to protect the system and the data it processes.
- Using the '>=' operator: When specifying dependencies, please make sure to use the `>=` operator while also maintaining version minima. This approach ensures our project remains compatible with older systems and setups, mitigating potential unsolvable dependency conflicts.
For example, if our project currently uses a package `example-package==1.2.3`, and a security update is released as `1.2.4`, instead of updating to `example-package==1.2.4`, we can set it to `example-package>=1.2.3,<2.0.0`. This permits the necessary security update and at the same time prevents an automatic jump to a potentially incompatible major version in the future.
Keeping version minima as low as possible also helps prevent cases where package versions cannot be resolved.
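To make the effect of the range `>=1.2.3,<2.0.0` concrete, here is a small sketch that tests a few candidate versions against it. It is a simplification: `parse_version` and `satisfies` are hypothetical helpers that handle only plain dotted numeric versions, not the full version syntax that real resolvers such as `poetry` implement.

```python
def parse_version(text):
    """Turn a plain dotted version such as '1.2.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, minimum, upper_bound):
    """True if minimum <= version < upper_bound, comparing component by component."""
    return parse_version(minimum) <= parse_version(version) < parse_version(upper_bound)


# The range from the example above: example-package>=1.2.3,<2.0.0
for candidate in ("1.2.2", "1.2.3", "1.2.4", "1.9.0", "2.0.0"):
    print(candidate, satisfies(candidate, "1.2.3", "2.0.0"))
```

The security release `1.2.4` and later `1.x` releases fall inside the range, while `1.2.2` (below the minimum) and `2.0.0` (the next major version) do not.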
`dlt` uses `mypy` and `flake8` with several plugins for linting.
`dlt` uses `pytest` for testing.
To test common components (which don't require external resources), run `make test-common`.
To test local destinations (`duckdb` and `postgres`), run `make test-load-local`.
To test external destinations, use `make test`. You will need the following external resources:
- `BigQuery` project
- `Redshift` cluster
- `Postgres` instance. A docker compose setup for the Postgres instance is available in `tests/load/postgres/`. When started, the instance is already configured to work with the tests.
cd tests/load/postgres/
docker-compose up --build -d
See `tests/.example.env` for the expected environment variables and a command line example to run the tests. Then create `tests/.env` from it. You configure the tests as you would configure the dlt pipeline.
We'll provide you with access to the resources above if you wish to test locally.
Use Python 3.8 for development, as it's the lowest supported version for `dlt`. You'll need `distutils` and `venv`. You may also use `pyenv`, as suggested by poetry.
This section is intended for project maintainers who have the necessary permissions to manage the project's versioning and publish new releases. If you're a contributor, you can skip this section.
Please read how we version the library first.
The source of truth for the current version is `pyproject.toml`, and we use `poetry` to manage it.
Before publishing a new release, make sure to bump the project's version accordingly:
- Check out the devel branch.
- Use `poetry version patch` to increase the patch version.
- Run `make build-library` to apply the changes to the project.
- Create a new branch, and submit the PR to devel. Go through the standard process to merge it.
- Create a merge PR from `devel` to `master` and merge it with a merge commit.
- Check out the master branch.
- Use `poetry version patch` to increase the patch version.
- Run `make build-library` to apply the changes to the project.
- Create a new branch, submit the PR to master and merge it.
Occasionally we may release an alpha version directly from the branch.
- Check out the devel branch.
- Use `poetry version prerelease` to increase the alpha version.
- Run `make build-library` to apply the changes to the project.
- Create a new branch, submit the PR to devel and merge it.
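As an illustration of what the two bump rules used above do to a version string, here is a rough sketch. It is not poetry's implementation: `bump` is a hypothetical helper that covers only plain `X.Y.Z` versions and alpha prereleases of the form `X.Y.ZaN`, and it assumes the behavior described in poetry's documentation for `patch` and `prerelease`. Run `poetry version` to confirm the actual result.

```python
import re

# Only X.Y.Z and X.Y.ZaN are handled; everything else is out of scope here.
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:a(\d+))?$")


def bump(version, rule):
    """Approximate `poetry version patch` / `poetry version prerelease`."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError("unsupported version format: " + version)
    major, minor, patch = (int(match.group(i)) for i in (1, 2, 3))
    alpha = match.group(4)

    if rule == "patch" and alpha is None:
        # Stable release: increase the patch number.
        return "{}.{}.{}".format(major, minor, patch + 1)
    if rule == "prerelease":
        if alpha is None:
            # Stable release: start the alpha series of the next patch version.
            return "{}.{}.{}a0".format(major, minor, patch + 1)
        # Already an alpha: increase the alpha counter.
        return "{}.{}.{}a{}".format(major, minor, patch, int(alpha) + 1)
    raise ValueError("case not covered by this sketch")


print(bump("1.2.3", "patch"))
print(bump("1.2.3", "prerelease"))
print(bump("1.2.4a0", "prerelease"))
```

The sketch deliberately raises for combinations it does not model (for example a patch bump of an alpha version) rather than guessing what poetry would do.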
Once the version has been bumped, follow these steps to publish the new release to PyPI:
- Ensure that you are on the master branch and have the latest code that has passed all tests on CI.
- Verify the current version with `poetry version`.
- Obtain a PyPI access token and configure it with `poetry config pypi-token.pypi your-api-token`.
- Run `make publish-library` to publish the new version.
- Create a release on GitHub, using the version and git tag as the release name.
If you have any questions or need help, don't hesitate to reach out to us. We're here to help you succeed in contributing to `dlt`. Happy coding!