Releases: dlt-hub/dlt
0.4.1a2
🧪 pre-release of 0.4.x (do not use in production)
- fixed attribute check: getuid -> geteuid by @jorritsandbrink in #823
- allows running parallel pipelines in separate threads by @rudolfix in #813
- parallel pipelines docs update: https://dlthub.com/devel/reference/performance#running-several-pipelines-in-parallel-in-single-process
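The thread-based setup described above can be pictured with the standard library alone. This is a minimal sketch, not dlt's API: `run_pipeline` is a hypothetical stand-in for whatever builds and runs one pipeline, and the pipeline names are made up. The linked performance docs remain the reference for the actual setup.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def run_pipeline(name: str) -> dict:
    # Hypothetical stand-in: a real implementation would create and run
    # one pipeline here. We only record which thread did the work.
    return {"pipeline": name, "thread": threading.current_thread().name}


# Made-up pipeline names, one unit of work per name.
pipeline_names = ["pipeline_a", "pipeline_b", "pipeline_c"]

# Each pipeline runs in its own worker thread; executor.map returns
# results in the same order as the inputs.
with ThreadPoolExecutor(max_workers=3) as executor:
    results = list(executor.map(run_pipeline, pipeline_names))

for result in results:
    print(result["pipeline"], "ran on", result["thread"])
```

Keeping each pipeline's work inside a single function call makes it easy to hand the same function to a thread pool without sharing mutable state between workers.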
New Contributors
- @IlyaFaer made their first contribution in #820
- @jorritsandbrink made their first contribution in #823
Full Changelog: 0.4.1a1...0.4.1a2
0.4.1a1
🧪 pre-release of 0.4.x (do not use in production)
- `load_id` is generated in extract step and carried till the end to improve data lineage by @rudolfix in #790
- added destination names, environment and ability to configure them by custom name by @sh-rp in #783
- step info (extract, normalize, load) contains a list of load packages in traces by @rudolfix in #801
- adds exception traces to run trace by @rudolfix in #806
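The lineage idea behind the `load_id` change can be illustrated as follows. This is a rough sketch under stated assumptions, not dlt's implementation: the `LoadPackage` class, the three step functions and the timestamp-based id format are all hypothetical. The point is only that the id is created once, in the extract step, and reused unchanged by later steps.

```python
import time
from dataclasses import dataclass, field


@dataclass
class LoadPackage:
    # Hypothetical container for one batch of data moving through the steps.
    load_id: str
    rows: list
    steps: list = field(default_factory=list)


def extract(rows) -> LoadPackage:
    # Assumption: the id is a timestamp-like string created exactly once here.
    package = LoadPackage(load_id=str(time.time()), rows=list(rows))
    package.steps.append(("extract", package.load_id))
    return package


def normalize(package: LoadPackage) -> LoadPackage:
    # Stamp every row with the id that was created during extract.
    package.rows = [{**row, "load_id": package.load_id} for row in package.rows]
    package.steps.append(("normalize", package.load_id))
    return package


def load(package: LoadPackage) -> LoadPackage:
    # The load step reuses the same id instead of generating a new one.
    package.steps.append(("load", package.load_id))
    return package


package = load(normalize(extract([{"id": 1}, {"id": 2}])))
print(package.steps)
```

Because every step records the same id, a row in the destination can be traced back to the exact extract run that produced it.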
Consult #763 for a list of major changes compared to the 0.3.x versions.
Full Changelog: 0.4.1a0...0.4.1a1
0.4.1a0
🧪 pre-release of 0.4.x (do not use in production)
- Parametrized destinations by @steinitzu in #746
- schema contract by @sh-rp in #594
- source and schema changes by @sh-rp in #769
- introduce black formatting by @sh-rp in #583
- docs updates by @sh-rp in #784
- documents schema and data contract by @rudolfix in #782
- Fix: ensure accessor typing does not make static type checker error by @z3z1ma in #785
- prototype platform connection by @sh-rp in #727
Full Changelog: 0.3.24...0.4.1a0
0.3.25
Core Library
- Add authenticator to SnowflakeCredentials class by @gjdevincentis in #734
- Set port correctly in mssql connection string by @steinitzu in #731
- Empty rows fix by @steinitzu in #745
- Nit: Remove py.typed file that makes Pyright incessant by @z3z1ma in #732
- adds more name hashes to telemetry by @rudolfix in #764
- Autodetector for ISO date strings by @codingcyclist in #767
- pipeline: drop pending packages by @rudolfix in #771
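To make the ISO date autodetection entry more concrete, here is a minimal sketch of what such a detector could look like. It is not the code from #767: the function name `detect_iso_date` and the decision to accept only the plain `YYYY-MM-DD` shape are assumptions made for illustration.

```python
import re
from datetime import date

# Assumption: only the plain YYYY-MM-DD shape counts as an ISO date string.
_ISO_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def detect_iso_date(value):
    """Return a date if value is a valid YYYY-MM-DD string, else None."""
    if not isinstance(value, str):
        return None
    match = _ISO_DATE.fullmatch(value)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        # date() rejects impossible calendar dates such as February 30.
        return date(year, month, day)
    except ValueError:
        return None


for candidate in ["2023-11-15", "2023-02-30", "15/11/2023", 20231115]:
    print(repr(candidate), "->", detect_iso_date(candidate))
```

Checking the shape first and the calendar validity second keeps strings that merely look date-like from being typed as dates.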
Docs
- Copy improvements in the SQL Database verified source by @anuunchin in #749
- Docs: add Examples contributing doc by @AstrakhantsevaAA in #743
- Example: nested mongo data by @AstrakhantsevaAA in #737
- Documents how to use dbt wrapper without pipeline by @rudolfix in #733
New Contributors
- @gjdevincentis made their first contribution in #734
- @anuunchin made their first contribution in #749
Full Changelog: 0.3.24...0.3.25
0.3.24
Core Library
- Show many schemas in embedded streamlit app by @deeplook in #690
- Qdrant destination support by @Anush008 in #724
New Contributors
- @hibajamal made their first contribution in #728
- @deeplook made their first contribution in #690
- @Anush008 made their first contribution in #724
Full Changelog: 0.3.23...0.3.24
0.3.23
Core Library
- Replace multiprocessing pool with futures executors. Now you can run multi-thread code on Lambda (no more semaphore problems) by @steinitzu in #719
- Support all writer types in parquet normalizer. You can load Arrow/Pandas to any destination by @steinitzu in #704
- Add `_dlt_load_id` and `_dlt_id` when loading Arrow/Pandas. @steinitzu in #704
- Pass Arrow/Pandas as data to run method. No need to wrap them in resources. by @rudolfix in #723
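As an illustration of the two columns named above, the sketch below stamps plain Python rows with a shared `_dlt_load_id` and a per-row `_dlt_id`. The release notes do not say how dlt computes these values for Arrow/Pandas data, so the helper `add_dlt_columns`, the use of `uuid4` and the sample load id are assumptions, not the library's behavior.

```python
import uuid


def add_dlt_columns(rows, load_id):
    # Hypothetical helper: every row gets the same load id and its own
    # unique row id. uuid4 is an assumption made for this sketch.
    return [
        {**row, "_dlt_load_id": load_id, "_dlt_id": uuid.uuid4().hex}
        for row in rows
    ]


# Made-up rows and a made-up load id.
loaded = add_dlt_columns([{"name": "a"}, {"name": "b"}], load_id="1700000000.0")

for row in loaded:
    print(row)
```

The shared column groups rows by the load that produced them, while the per-row column gives each record a stable handle.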
Docs
- filesystem/bucket verified source docs by @dat-a-man in #712
- in depth explanation of `dlt` config and secrets! by @AstrakhantsevaAA in #703
- new example: running chess pipeline in production with retries, trace saving and slack messages by @AstrakhantsevaAA in #711
Sources
- Simple, incremental Kinesis reader by @sehnem dlt-hub/verified-sources#276
Full Changelog: 0.3.22...0.3.23
0.3.22
Core Library
- Fix: make Load single-thread compatible by @codingcyclist in #698
- Enable compatible s3 storages like R2: support aws config `endpoint_url` with fsspec by @steinitzu in #701
- performance improvements in Arrow loading by @rudolfix in #707
- datatype autodetection for unix timestamp removed from defaults by @rudolfix in #707
Full Changelog: 0.3.21...0.3.22
0.3.21
What's Changed
- athena iceberg by @sh-rp in #659
  Use the new table format table hint to store selected Athena tables in iceberg format (https://dlthub.com/docs/dlt-ecosystem/destinations/athena#iceberg-data-tables)
- Pyarrow direct loading by @steinitzu and @tomsej in #679
  Allows passing Arrow tables and Pandas frames to the `run` method and loading them directly (via parquet) without data copy (https://dlthub.com/docs/dlt-ecosystem/verified-sources/arrow-pandas), which should result in immense speedups in many loading cases
- Features/dbt cloud by @AstrakhantsevaAA in #694
  Run dbt jobs in the cloud (https://dlthub.com/docs/dlt-ecosystem/transformations/dbt/dbt_cloud)
- enables duckdb 0.9.1 and improves motherduck docs by @rudolfix in #695
  Please also read the updated Motherduck documentation (https://dlthub.com/docs/dlt-ecosystem/destinations/motherduck). You may want to reduce load parallelism if you are on a weak internet connection.
- allows providing a custom implementation of `DltSource` to the source decorator by @rudolfix in #687
Bugfixes
- change source schema handling and normalizer `root_key` propagation. fixes various problems where merge and replace write dispositions were subsequently used in the same pipeline by @sh-rp in #686
- fixes bug in drop command by @sh-rp in #693
Full Changelog: 0.3.19...0.3.21
0.3.19
What's Changed
- easy renaming of resources by @rudolfix in #671
- standalone resources and transformers intended to be used outside of the source by @rudolfix in #671
- Building blocks for reading and manipulating files in buckets available in `dlt.sources.filesystem` by @rudolfix in #671
New dlt sources
- Read files from buckets, stream large json, csv, parquet and other files - also incrementally
- Read messages and attachments from e-mail inbox
Docs
- Code Examples for docs by @sh-rp in #616
- Holistic integration blogpost by @zem360
- Improved sources documentation by @AstrakhantsevaAA @dat-a-man
Bugfixes
- Update min snowflake-connector version by @steinitzu in #664
- Fix: read pkey as DER format, not PEM by @codingcyclist in #680
Full Changelog: 0.3.18...0.3.19
0.3.18
Core Library
- Support for precision, scale in column schema by @steinitzu in #646
- validation of extract data with Pydantic models by @steinitzu in #638
- moves fsspec support to common code by @rudolfix in #626
- Allow base64 encoded private keys for Snowflake destination by @codingcyclist in #637
- duck case naming convention that allows for emojis and other special characters in identifiers by @rudolfix in #660
- Typing fixes and enable mypy in tests by @steinitzu in #661
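The precision and scale hints mentioned above follow the usual decimal convention: precision is the total number of digits and scale is the number of digits after the decimal point. The sketch below shows that convention with Python's `decimal` module. The helper `fits_decimal` is hypothetical and is not part of dlt.

```python
from decimal import Decimal


def fits_decimal(value: Decimal, precision: int, scale: int) -> bool:
    """Check whether value fits a decimal(precision, scale) column."""
    _sign, digits, exponent = value.as_tuple()
    if not isinstance(exponent, int):
        # NaN and Infinity carry a non-integer exponent marker.
        return False
    # Digits after the decimal point.
    fractional_digits = max(-exponent, 0)
    # Digits before the decimal point.
    integer_digits = max(len(digits) + exponent, 0)
    return fractional_digits <= scale and integer_digits <= precision - scale


for text in ["123.45", "1234.5", "1.234", "0.01"]:
    print(text, "fits decimal(5, 2):", fits_decimal(Decimal(text), 5, 2))
```

With precision 5 and scale 2, at most three digits may appear before the decimal point and at most two after it.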
Bugfixes
- Allows table and resource names like `state` by fixing set attribute bug by @steinitzu in #657
- fixes `dlt pipeline show` streamlit app start-up by @rudolfix in #645
- replaces `depends_on` with `data_from` in resource decorator @rudolfix in #645
- fixes json logger reinit and drop json-logger dependency @rudolfix in #645
- forces local duckdb version when creating dbt runner venv to prevent storage version clashes @rudolfix in #645
- detects when motherduck does not support local duckdb version @rudolfix in #645
Docs
- Rework 'Understanding the tables' by @burnash in #629
- Added Airtable docs by @dat-a-man in #635
- `dlt` API Reference by @AstrakhantsevaAA in #642
- Add a Zendesk to Weaviate walkthrough by @burnash in #641
- Add a blog post: Load Zendesk tickets to Verba by @burnash in #654
- Added Slack Docs! by @dat-a-man in #643
Full Changelog: 0.3.17...0.3.18