25 Apr 05:50

rudolfix

efaedc2

0.4.9

Core Library

SCD2 support by @jorritsandbrink in #1168 https://dlthub.com/devel/general-usage/incremental-loading#scd2-strategy
A fully configurable layout for filesystem files by @sultaniman in #1182 https://dlthub.com/devel/dlt-ecosystem/destinations/filesystem#files-layout
picks file format matching item format to minimize number of rewrites during loading by @rudolfix in #1222
fix athena iceberg's trailing location by @romanperesypkin in #1230
Pass options to parse iso like strings by @VioletM in #1219
pipeline state can be restored from filesystem destination by @sh-rp in #1184 - https://dlthub.com/devel/dlt-ecosystem/destinations/filesystem#syncing-of-dlt-state
Remove staging-optimized replace strategy for synapse by @jorritsandbrink in #1231
fixes a bug where configs were not injected for async functions by @sh-rp in #1241
feat(transform): implement columns pivot map function by @IlyaFaer in #1152
Add max_table_nesting to resource decorator by @sultaniman in #1242
adds csv options to write headers, change delimiter, quotation style by @rudolfix in #1239
Check for default schema and schema name in streamlit session by @sultaniman in #1155
Add seconds and millisecond timestamps to filesystem date placeholders by @sultaniman in #1260
send dlt telemetry wherever you want, not only segment by @zem360 in #1236
Make merge write-disposition fall back to staging append if no primary or merge keys are specified by @sh-rp in #1225
Add snowflake application parameter to configuration by @sultaniman in #1266
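
The SCD2 strategy listed above keeps history by retiring the old version of a changed row and adding a new one. The following is a minimal plain-Python sketch of that idea, not dlt's implementation: the helper `apply_scd2` and the column names `valid_from` and `valid_to` are assumptions made for illustration only.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def apply_scd2(history: List[Row], snapshot: List[Row], key: str, ts: int) -> List[Row]:
    """Return a new history after applying a full snapshot taken at time `ts`.

    Hypothetical helper: an active record has valid_to set to None.
    """
    incoming = {row[key]: row for row in snapshot}
    result: List[Row] = []
    unchanged = set()
    for rec in history:
        rec = dict(rec)  # copy so the caller's history is not mutated
        if rec["valid_to"] is None:
            data = {k: v for k, v in rec.items() if k not in ("valid_from", "valid_to")}
            if incoming.get(rec[key]) == data:
                unchanged.add(rec[key])  # same content: keep the active version
            else:
                rec["valid_to"] = ts  # changed or missing: retire this version
        result.append(rec)
    # Add a new active version for every new or changed key.
    for k, row in incoming.items():
        if k not in unchanged:
            result.append({**row, "valid_from": ts, "valid_to": None})
    return result
```

Applying two snapshots in which the row with `id` 1 changes its name leaves two records in the history: the retired first version and the active second one.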

Docs

Added docs for deploying dlt with Prefect. by @dat-a-man in #1138
a note on scd2 incoming high ts change by @rudolfix in #1273
adding images and wordsmithing to Prefect walkthrough by @WillRaphaelson in #1276

Verified Sources

Use pyarrow, pandas, connectorx or sqlalchemy backends when reading tables with sql_database. See README for details. dlt-hub/verified-sources#425
Google ads source is available dlt-hub/verified-sources#428
Pages endpoint for notion dlt-hub/verified-sources#429

New Contributors

@romanperesypkin made their first contribution in #1230
@WillRaphaelson made their first contribution in #1276

Full Changelog: 0.4.8...0.4.9

Contributors

sultaniman, sh-rp, and 8 other contributors


19 Apr 08:34

rudolfix

0.4.9a2

902963c

0.4.9a2 Pre-release


A pre-release that lets you try out the following features and includes the following bugfixes:

SCD2 support by @jorritsandbrink in #1168 (we are still working on BigQuery support) https://dlthub.com/devel/general-usage/incremental-loading#scd2-strategy
A fully configurable layout for filesystem files by @sultaniman in #1182 https://dlthub.com/devel/dlt-ecosystem/destinations/filesystem#files-layout
picks file format matching item format by @rudolfix in #1222
fix athena iceberg's trailing location by @romanperesypkin in #1230
Pass options to parse iso like strings by @VioletM in #1219
filesystem state sync by @sh-rp in #1184 - https://dlthub.com/devel/dlt-ecosystem/destinations/filesystem#syncing-of-dlt-state
Remove staging-optimized replace strategy for synapse by @jorritsandbrink in #1231
fixes a bug where configs were not injected for async functions by @sh-rp in #1241
adds options to write csv headers, change delimiter by @rudolfix in #1239
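
The configurable filesystem layout mentioned in the list above is a path template with named placeholders. The following plain-Python sketch only illustrates the substitution idea and is not dlt's code: the helper `render_layout` is hypothetical, and the date placeholder names `YYYY`, `MM` and `DD` are assumptions; the linked documentation lists the placeholders that are actually supported.

```python
from datetime import datetime, timezone


def render_layout(
    layout: str, table_name: str, load_id: str, file_id: str, ext: str, now: datetime
) -> str:
    """Fill a layout template with values for one file (hypothetical helper)."""
    params = {
        "table_name": table_name,
        "load_id": load_id,
        "file_id": file_id,
        "ext": ext,
        # Assumed date placeholders, zero-padded.
        "YYYY": f"{now.year:04d}",
        "MM": f"{now.month:02d}",
        "DD": f"{now.day:02d}",
    }
    return layout.format(**params)


path = render_layout(
    "{table_name}/{YYYY}/{MM}/{DD}/{load_id}.{file_id}.{ext}",
    table_name="events",
    load_id="1713500000",
    file_id="0",
    ext="jsonl",
    now=datetime(2024, 4, 19, tzinfo=timezone.utc),
)
print(path)
```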

Final release is scheduled for next week

Contributors

sultaniman, sh-rp, and 4 other contributors


09 Apr 13:55

rudolfix

0.4.8

c99d612

0.4.8

Core Library

Add Dremio as a destination by @maxfirman in #1026
adds a fast loading of arrow tables/pandas to postgres via COPY csv by @rudolfix in #1185
adds a csv writer for filesystem and postgres by @rudolfix in #1185
saves parquet with all logical types, spark flavor is not a default any longer by @rudolfix in #1185
feat(bigquery): add streaming inserts support by @IlyaFaer in #1123
Feat: parameterize pipeline class in the primary factory method by @z3z1ma in #1176
Fix: check for typeddict before class or subclass checks which fail by @z3z1ma in #1160
fixes column order and add hints table variants by @rudolfix in #1127
fixes schema versioning by @rudolfix in #1140
regular initializers for credentials / config specs are type checked like dataclasses by @rudolfix in #1142
fix streamlit app state display: Add yaml representer for pendulum datetime by @sultaniman in #1192
synapse and mssql bugfixes and improvements (INSERT VALUES UNION) by @jorritsandbrink in #1174
various improvements to arrow table normalization by @rudolfix in #1185
arrow tables without rows create tables in destination by @rudolfix in #1185
fixes Motherduck configuration to use my_db default database and makes password / token mandatory by @rudolfix in

Docs

docs: add typechecking to embedded snippets by @sh-rp in #1130
Fix typo with switched column names in schema evolution docs page by @b-per in #1132
Docs: deploy with Kestra by @dat-a-man in #1087
Docs: Deploy dlt on dagster by @dat-a-man in #1086
Update example connection string by @MiConnell in #1188
Changed directory of all the blog images to google cloud storage. by @dat-a-man in #1156

Verified Sources

postgres replication / CDC by @jorritsandbrink dlt-hub/verified-sources#392

New Contributors

@b-per made their first contribution in #1132
@MiConnell made their first contribution in #1188
@maxfirman made their first contribution in #1026

Full Changelog: 0.4.7...0.4.8

Contributors

sultaniman, sh-rp, and 8 other contributors


22 Mar 07:31

rudolfix

0.4.7

be12a1c

0.4.7

Core Library

Custom destinations with @dlt.destination decorator by @sh-rp in #1065
A BigQuery custom destination supporting STRUCT data types by @sh-rp in #1107
Built-in Streamlit rewrite, UI improvements, dark theme by @sultaniman in #1060
fixes various edge cases with Incremental data deduplication, for ordered and unordered results #971 by @rudolfix in #1062
Adds new dlt.mark marker to materialize table schemas without data by @rudolfix in #1122
validates class instances in typed dict by @rudolfix in #1082
feat(airflow): allow re-using sources in airflow wrapper by @IlyaFaer in #1080
feat(core): drop default value for write disposition by @IlyaFaer in #1057
splits pandas and arrow imports to fix pyarrow.compute missing by @rudolfix in #1112
improve no schema upgrade path exception by @sh-rp in #1125

Docs

docs(airflow): add description of new decompose methods by @IlyaFaer in #1072
check embedded code blocks by @sh-rp in #1093
docs(kafka): describe the possible sync issues by @IlyaFaer in #1100
Docs: schema evolution by @dat-a-man in #1078
Add example link to the custom destination page by @VioletM in #1120

Full Changelog: 0.4.6...0.4.7

Contributors

sultaniman, sh-rp, and 4 other contributors


06 Mar 08:03

rudolfix

0.4.6

1957384

0.4.6

Core Library

feat(airflow): expose the Airflow runner method to create custom DAGs by @IlyaFaer in #1014
removes sql alchemy dependency and port parts of URL class by @rudolfix in #1028
Parallelize decorator - run many regular generators in parallel by @steinitzu in #965
Add main entry point to support calling dlt as python module by @sultaniman in #1023
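
The parallelize feature in the list above runs several regular generators at the same time. As a rough illustration of the general technique, and not of how dlt implements it, the sketch below drains each generator in its own thread and yields items as they arrive; `run_parallel` is a hypothetical helper.

```python
import queue
import threading
from typing import Any, Iterable, Iterator

_DONE = object()  # sentinel put on the queue when one generator is exhausted


def run_parallel(*generators: Iterable[Any]) -> Iterator[Any]:
    """Yield items from all generators, each drained in its own thread."""
    out: "queue.Queue[Any]" = queue.Queue()

    def drain(gen: Iterable[Any]) -> None:
        try:
            for item in gen:
                out.put(item)
        finally:
            out.put(_DONE)  # always signal completion, even on error

    threads = [threading.Thread(target=drain, args=(g,), daemon=True) for g in generators]
    for t in threads:
        t.start()

    remaining = len(threads)
    while remaining:
        item = out.get()
        if item is _DONE:
            remaining -= 1
        else:
            yield item


# Item order across generators is not deterministic, so sort before comparing.
merged = sorted(run_parallel(range(3), range(10, 13)))
```

This sketch does not propagate exceptions raised inside a worker thread; a real implementation would need to surface them.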

Library Bugfixes

fixes naive datetime bug in incremental by @rudolfix in #1020
Import missing pyarrow compute for transforms on arrowitems by @sh-rp in #1010
delete normalized package in case it already existed by @sh-rp in #1012
fix(core): validation error with TTableHintTemplate by @IlyaFaer in #1039
adds test case where payload data contains PUA unicode characters by @willi-mueller in #1053
fix add_limit behavior in edge cases by @sh-rp in #1052
adds row_order to Incremental - automatically stop taking data when out of range by @rudolfix in #1041
Fix to serialize load metrics as list instead of a dictionary by @sultaniman in #1051
fix import schema workflow by @sh-rp in #1013
rollback all changes to live schemas when extraction fails by @sh-rp in #1013
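
The `row_order` entry above describes stopping early once ordered data leaves the requested range. The idea can be sketched in plain Python as follows. This is not dlt's code: `take_in_range` is a hypothetical helper, and treating the range as start-inclusive and end-exclusive is an assumption of the sketch.

```python
from typing import Any, Dict, Iterable, Iterator

Row = Dict[str, Any]


def take_in_range(rows: Iterable[Row], cursor: str, start: Any, end: Any) -> Iterator[Row]:
    """Yield rows with start <= row[cursor] < end, assuming ascending order."""
    for row in rows:
        value = row[cursor]
        if value >= end:
            break  # ordered input: nothing later can be in range, stop reading
        if value >= start:
            yield row


consumed = []


def source() -> Iterator[Row]:
    # Stand-in for a large ordered result set; records which rows were read.
    for i in range(1000):
        consumed.append(i)
        yield {"id": i}


selected = list(take_in_range(source(), "id", 3, 6))
```

Because the input is ordered, only the rows up to the first out-of-range value are read from the source; the rest are never fetched.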

Docs

Fix zendesk example test by @VioletM in #1027
Edit arrow-pandas.md and fix a typo by @Bl3f in #1001
Added info about file compression to filesystem docs by @dat-a-man in #975
Update "create destination" docs with new file layouts by @steinitzu in #1032
Docs update on how to set query limits. by @dat-a-man in #973
Docs/Updated for slack alerts. by @dat-a-man in #1042

Verified Sources

scrape web sites with spiders and Scrapy and send data to dlt @sultaniman dlt-hub/verified-sources#332
sql_database recognizes end_value and row_order to return rows in range and optionally ordered. backfill and proper Airflow intervals support @rudolfix dlt-hub/verified-sources#388

New Contributors

@Bl3f made their first contribution in #1001

Full Changelog: 0.4.5...0.4.6

Contributors

willi-mueller, sultaniman, and 7 other contributors


26 Feb 22:30

rudolfix

0.4.5

d6c93fe

0.4.5

Core Library

enables google drive filesystem for sources and destinations (second one experimental, google drive listings are only eventually consistent!) by @IlyaFaer in #932
creates parallel Airflow DAGs in airflow helper to allow many resources to be executed at once @IlyaFaer in #966
855 create bigquery adapter for dlt resources: easily configure partitions, clustering, data retention etc. by @Pipboyguy in #952 and https://dlthub.com/docs/dlt-ecosystem/destinations/bigquery#bigquery-adapter
Use BIGNUMERIC for large decimals in bigquery by @steinitzu in #984
Normalize keys for Google secrets config provider by @sultaniman in #963
does not lowercase postgres and redshift database names by @rudolfix in #990
Introduce hard_delete and dedup_sort columns hint for merge by @jorritsandbrink in #960 and https://dlthub.com/docs/general-usage/incremental-loading#delete-records
adjustment of pua start in typed json encoding, pass through on decoding errors by @rudolfix in #974
creates isolated parallel Airflow DAGs in airflow helper to execute resources in parallel in isolated pipelines @IlyaFaer in #979
Fix annotation processing and rebuilding, mark dataclass as complex by @sultaniman in #980
allows async functions to be decorated with dlt.source by @rudolfix in #985
allows right pipe operator to feed simple lists into a transformer @rudolfix in #985
allows pendulum datetime as incremental cursor when loading arrow tables @rudolfix in #985
enables Python 3.12 (mind that not all extras have python 3.12 libraries!) @rudolfix in #985
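
The `hard_delete` and `dedup_sort` hints listed above affect how a merge treats duplicate and deleted records. The plain-Python sketch below shows one way to read those semantics and is not dlt's implementation: `merge_batch` and the column names used in the example are hypothetical, and the rule "keep the row with the highest sort value, then delete if it is flagged" is an assumption of the sketch.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def merge_batch(
    destination: Dict[Any, Row], batch: List[Row], key: str, sort_col: str, delete_col: str
) -> Dict[Any, Row]:
    """Merge a batch into `destination` (a dict keyed by primary key), in place."""
    # Step 1: deduplicate, keeping the row with the highest sort value per key.
    latest: Dict[Any, Row] = {}
    for row in batch:
        k = row[key]
        if k not in latest or row[sort_col] > latest[k][sort_col]:
            latest[k] = row
    # Step 2: apply the surviving rows; a truthy delete flag removes the record.
    for k, row in latest.items():
        if row.get(delete_col):
            destination.pop(k, None)
        else:
            destination[k] = row
    return destination


dest = {1: {"id": 1, "v": "old"}, 2: {"id": 2, "v": "keep"}}
batch = [
    {"id": 1, "v": "a", "seq": 1, "deleted": False},
    {"id": 1, "v": "b", "seq": 2, "deleted": False},
    {"id": 2, "v": "x", "seq": 1, "deleted": True},
    {"id": 3, "v": "n", "seq": 1, "deleted": False},
]
merge_batch(dest, batch, key="id", sort_col="seq", delete_col="deleted")
```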

Docs

docs(filesystem): include Google Drive into filesystem tutorial by @IlyaFaer in #962
Fix typos/grammar in tutorial docs by @taljaards in #972
add blog post observability by @adrianbr in #989
Update arrow-pandas.md by @snehangsude in #992
Clarify info about GoodData in modelling tools article by @mhauzirek in #956
Fix small typings in contributing guide by @VioletM in #993
Docs/google sheets update by @dat-a-man in #976
Added "Incremental Configuration" section to SQL Databases documentat… by @dat-a-man in #977

Verified Sources

Bing Webmaster source by @willi-mueller

New Contributors

@taljaards made their first contribution in #972
@mhauzirek made their first contribution in #956
@snehangsude made their first contribution in #992
@VioletM made their first contribution in #993

Full Changelog: 0.4.4...0.4.5

Contributors

willi-mueller, sultaniman, and 11 other contributors


11 Feb 23:47

rudolfix

0.4.4

f1633e5

0.4.4

Core Library

passes incremental from apply hints to resource function by @rudolfix in #953
Handle UnionType when checking is_union_type and is_optional_type by @sultaniman in #951
yanks orjson to <=0.3.10 by @rudolfix in #958

Docs

Databricks workspace setup docs by @steinitzu in #949

Verified Source

allows for table reflection at runtime, column selection and buffer control in sql_database @rudolfix (dlt-hub/verified-sources#351)

Full Changelog: 0.4.3...0.4.4

Contributors

sultaniman, steinitzu, and rudolfix


07 Feb 19:20

rudolfix

0.4.3

1da9331

0.4.3

Core Library

Databricks destination by @steinitzu and @phillem15 in #892
Synapse destination by @jorritsandbrink in #900
BigQuery Partitioning Improvements by @Pipboyguy in #887
enable async generators as resources by @sh-rp in #905
fix: use truthy value in ternary since 0 cause div by zero by @z3z1ma in #902
feat(filesystem): add compression flag if the read file is GZ by @IlyaFaer in #912
Enhancements in Filesystem Configuration by @Pipboyguy in #869
add mark function to emit resource hints from decorated function by @rudolfix in #938
handles nested Pydantic models when generating dlt schema by @sultaniman in #901

Docs

Restructure intro, getting started and tutorial by @burnash in #702
Update the release instructions in CONTRIBUTING.md by @burnash in #867
Add explicit sub section about streamlit under getting started by @sultaniman in #884
Examples: google sheets by @AstrakhantsevaAA in #846
Added URL-parser documentation by @dat-a-man in #909

Verified Sources

feat(filesystem): implement a csv reader with duckdb engine @IlyaFaer dlt-hub/verified-sources#319
fix(notion): define payload within the while-loop @glebzhidkov (dlt-hub/verified-sources#338)
sql alchemy + connector x example @rudolfix (dlt-hub/verified-sources#334)
Shopify: Standalone resource for partner API queries @steinitzu (dlt-hub/verified-sources#329)
sql-database: detect precision and scale of supported column types @steinitzu (dlt-hub/verified-sources#324)
feat(sources.kafka): implement Kafka source @IlyaFaer (dlt-hub/verified-sources#306)

New Contributors

@Pipboyguy made their first contribution in #869
@sultaniman made their first contribution in #883

Full Changelog: 0.4.2...0.4.3

Contributors

burnash, sultaniman, and 11 other contributors


29 Dec 21:04

burnash

0.4.2

3d13835

0.4.2

Core Library

Fix the data type used in the from_db_type() method from MsSqlTypeMapper by @jorritsandbrink in #863
Use Secret Manager in CI by @AstrakhantsevaAA in #859
Move destination adapters to dlt.destination.adapters by @rudolfix in #854

Docs

Improve HubSpot source docs by @IlyaFaer in #864
Add new topic to docs: Destination; improve Configuration docs by @rudolfix in #861

Full Changelog: 0.4.1...0.4.2

Contributors

rudolfix, AstrakhantsevaAA, and 2 other contributors


23 Dec 12:52

rudolfix

0.4.1

84816c5

0.4.1

Major release

This is a major dlt release (as per our semantic versioning https://github.com/dlt-hub/dlt?tab=readme-ov-file#adding-as-dependency). It brings several interesting new features like: schema evolution control, data contracts, deeper Pydantic integration, parametrized destinations, improvements to parallelism and data lineage + many more

There are no significant breaking changes, but minor ones exist; please refer to #763 for details.

Core Library

Parametrized destinations - import destinations from dlt.destinations module and instantiate them: by @steinitzu in #746
schema and data contracts by @sh-rp in #594
load package id in extract step by @rudolfix in #790
named destinations: configure many destinations with different names by @sh-rp in #783
rich tracing information from pipeline steps (extract, normalize, load) by @rudolfix in #801
adds exception stack to pipeline trace by @rudolfix in #806
fixed attribute check: getuid -> geteuid by @jorritsandbrink in #823
allows running parallel pipelines in separate threads by @rudolfix in #813
791 test mssql credentialspy is odbc driver 18 dependent by @jorritsandbrink in #834
adds extract and normalize traces by @rudolfix in #839
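
The schema and data contracts entry above controls what happens when incoming data does not match the known schema. The plain-Python sketch below illustrates how such modes could treat a row with an unknown column. It is not dlt's implementation: `apply_column_contract` and `ContractViolation` are hypothetical names, and the behaviour attached to each mode is a simplified reading of the feature.

```python
from typing import Any, Dict, Optional, Set

Row = Dict[str, Any]


class ContractViolation(Exception):
    """Raised in this sketch when a frozen schema receives an unknown column."""


def apply_column_contract(row: Row, known_columns: Set[str], mode: str) -> Optional[Row]:
    """Return the row to load, or None when the whole row is discarded."""
    extra = set(row) - known_columns
    if not extra:
        return row
    if mode == "evolve":
        known_columns.update(extra)  # accept the new columns into the schema
        return row
    if mode == "discard_value":
        return {k: v for k, v in row.items() if k in known_columns}
    if mode == "discard_row":
        return None
    if mode == "freeze":
        raise ContractViolation(f"unknown columns: {sorted(extra)}")
    raise ValueError(f"unsupported mode: {mode}")
```

For example, with known columns `id` and `name`, a row that also carries `age` is loaded unchanged under `evolve`, loses `age` under `discard_value`, is dropped under `discard_row`, and raises under `freeze`.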

Plus some tooling changes

introduce black formatting by @sh-rp in #583
Fix: ensure accessor typing does not make static type checker error by @z3z1ma in #785
Hot fix: add skipifgithubfork to nested_data example by @AstrakhantsevaAA in #802
Fix Windows lint issue and implement CI lint matrix strategy by @jorritsandbrink in #827

Docs

documents schema and data contract by @rudolfix in #782
Added Kinesis documentation. by @dat-a-man in #804
788 clarify docs intro by @deanja in #797
Fix links to source code by @AstrakhantsevaAA in #805
Clarify docs dev process by @deanja in #809
Qdrant ingestion pipeline example eg by @hibajamal in #775
Personio doc: added more endpoints by @AstrakhantsevaAA in #829

New Contributors

@deanja made their first contribution in #797
@IlyaFaer made their first contribution in #820
@jorritsandbrink made their first contribution in #823

Full Changelog: 0.3.25...0.4.1

Contributors

steinitzu, sh-rp, and 8 other contributors


Releases: dlt-hub/dlt
