master merge for 1.4.1 release #2112

Merged: 26 commits, Dec 2, 2024
0c3f660  Updated sql_database documentation for resource usage (#2072)  (dat-a-man, Nov 19, 2024)
d5a3499  Docs: improve links visibility (#2078)  (burnash, Nov 20, 2024)
6c71168  Docs: fix formatting of info block in Kafka docs (#2080)  (burnash, Nov 20, 2024)
9a49868  edit snippet to make more runnable (#2066) (#2079)  (AstrakhantsevaAA, Nov 20, 2024)
810e619  adds engine adapter and passes incremental and engine to query adapte…  (rudolfix, Nov 23, 2024)
22800e3  Snowflake: remove unused imports (#2081)  (burnash, Nov 23, 2024)
f7dc346  allow to select schema from pipeline dataset factory (#2075)  (sh-rp, Nov 23, 2024)
6e0510a  Fixes the usage of escaped JSONPath in incremental cursors in sql_dat…  (burnash, Nov 23, 2024)
dfde071  ibis support - hand over credentials to ibis backend for a number of …  (sh-rp, Nov 23, 2024)
bfd0b52  azure account host docs (#2091)  (rudolfix, Nov 23, 2024)
a150f56  Update paginator type from json_response to json_link (#2093)  (burnash, Nov 24, 2024)
d5f6b47  Update (#2094)  (dat-a-man, Nov 25, 2024)
bc25a60  data access documentation (#2006)  (sh-rp, Nov 25, 2024)
f13e3f1  Support custom Ollama Host (#2044)  (Pipboyguy, Nov 25, 2024)
c283cee  Move "dlt in notebooks" (#2096)  (AstrakhantsevaAA, Nov 26, 2024)
d9cdc6c  docs: document that `path` can also be a URL (#2099)  (joscha, Nov 26, 2024)
aa80667  Allow specifying custom auth in resources (#2082)  (joscha, Nov 27, 2024)
6f146d1  Fix/2089 support sets for pyarrow backend (#2090)  (karakanb, Nov 27, 2024)
58d9951  Docs: fix minor typo in ClickHouse (#2103)  (jdbohrman, Nov 27, 2024)
da87edf  allow to increase total count on most progress bars, fixes incorrect …  (sh-rp, Nov 28, 2024)
2078754  Docs: fix parquet layout example (#2105)  (trymzet, Nov 28, 2024)
eefe77b  docs(rest_client): note about `data_selector` (#2101)  (joscha, Nov 28, 2024)
09914a3  Support Spatial Types for PostGIS (#1927)  (Pipboyguy, Nov 30, 2024)
61c2ed9  Incremental table hints and incremental in resource decorator (#2033)  (steinitzu, Nov 30, 2024)
f4faa83  #2087 allows double underscores in identifiers (#2098)  (rudolfix, Dec 2, 2024)
b4d807f  bumps to version 1.4.1  (rudolfix, Dec 2, 2024)
edit snippet to make more runnable (#2066) (#2079)
* edit snippet to make more runnable

Tried running the example as-is and ran into a few issues:
* dlt[parquet] was a missing dependency
* missing secret (removed, since it is not used in this example)
* removed the serialized flag (an advanced feature that is not really needed here)
* added a minimal reflection level; otherwise, running this script twice in a row fails with a "column constraint not supported" kind of error

* move imports into snippet

Co-authored-by: Kenny Ning <[email protected]>
AstrakhantsevaAA and kning authored Nov 20, 2024
commit 9a498680701689d41602474a520d270d7cfa9a53
@@ -1,17 +1,16 @@
-import os
-
-import modal
-
 from tests.pipeline.utils import assert_load_info


 def test_modal_snippet() -> None:
     # @@@DLT_SNIPPET_START modal_image
+    import modal
+
     # Define the Modal Image
     image = modal.Image.debian_slim().pip_install(
         "dlt>=1.1.0",
         "dlt[duckdb]",  # destination
         "dlt[sql_database]",  # source (MySQL)
+        "dlt[parquet]",  # file format dependency
         "pymysql",  # database driver for MySQL source
     )

@@ -25,19 +24,19 @@ def test_modal_snippet() -> None:
     @app.function(
         volumes={"/data/": vol},
         schedule=modal.Period(days=1),
-        secrets=[modal.Secret.from_name("sql-secret")],
-        serialized=True,
+        serialized=True
     )
     def load_tables() -> None:
         import dlt
+        import os
         from dlt.sources.sql_database import sql_database

         # Define the source database credentials; in production, you would save this as a Modal Secret which can be referenced here as an environment variable
         os.environ["SOURCES__SQL_DATABASE__CREDENTIALS"] = (
             "mysql+pymysql://[email protected]:4497/Rfam"
         )
-        # Load tables "family" and "genome"
-        source = sql_database().with_resources("family", "genome")
+        # Load tables "family" and "genome" with minimal reflection to avoid column constraint error
+        source = sql_database(reflection_level="minimal").with_resources("family", "genome")

         # Create dlt pipeline object
         pipeline = dlt.pipeline(
@@ -50,7 +49,7 @@ def load_tables() -> None:
         )

         # Run the pipeline
-        load_info = pipeline.run(source)
+        load_info = pipeline.run(source, write_disposition="replace")

         # Print run statistics
         print(load_info)
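The last hunk passes write_disposition="replace" to pipeline.run, which is what makes the daily scheduled run rerunnable: each run replaces the loaded tables instead of appending a second copy of every row. A toy stand-in (plain Python, not dlt's actual loader) illustrating the difference between the two dispositions:

```python
def toy_load(table, rows, write_disposition="append"):
    # Toy model of dlt's write dispositions:
    # "replace" truncates the target before loading, "append" just adds rows.
    if write_disposition == "replace":
        table = []
    return table + rows

rows = [{"rfam_acc": "RF00001"}, {"rfam_acc": "RF00002"}]

table = []
table = toy_load(table, rows, write_disposition="replace")
table = toy_load(table, rows, write_disposition="replace")  # rerun: no duplicates
print(len(table))  # 2

table = []
table = toy_load(table, rows, write_disposition="append")
table = toy_load(table, rows, write_disposition="append")  # rerun: rows pile up
print(len(table))  # 4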