Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(datasets): add dataset to load/save with Ibis #560
feat(datasets): add dataset to load/save with Ibis #560
Changes from 18 commits
2a38d58
7dfc850
19377c4
f5a6b73
c29e9e1
ee1dc38
0c0bc6b
9b8d2e8
bbec217
7b3ae33
1fcb01a
14ecc8c
31a23ad
a4f1d67
f085a46
19380d3
3bbfa3b
4757839
46fe6bd
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to add an local example here with some dummy dataframe / duckdb? I know there is a blog post, but just looking at the docs/docstring I think it's not that easy to get how an user should use this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we'd have both a
yaml
andpython
example.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just fyi, when connecting to ibis there are two major "paradigms", either a filebased connection or a db connection. From my limited knowledge I think the configuration arguments are quite different.
For reference, I am currently connecting to mssql dbs like this:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the documentation here should probably clarify what the different arguments map to in the ibis connection object.
in the case of a
filepath
being provided, it will map to the backend methodread_{file_format}
, e.g. https://ibis-project.org/backends/duckdb#ibis.backends.duckdb.Backend.read_parquetotherwise it will get the table from
ibis.{backend}.Backend.table
where the Backend object is obtained throughibis.{backend}.connect
e.g. https://ibis-project.org/backends/mssqlThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I forgot to add an example, it seems. 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added much more detail on this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any docs we can link to that specify which values a user can supply to
materialized
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not aware of anywhere it's documented centrally (i.e. not on a backend-specific basis) in Ibis. @lostmygithubaccount any chance you know?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we do not, this is on my TODO list to add API docs for alongside the other
read_*
andto_*
methodsThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good! @merelcht if it's OK then, we'll improve the documentation on the Ibis side, and then whenever that happens, we can make the dataset docs reference that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently when loading tables thru the
mssql
backend I am doingconn.table(table_name="XXX", schema="A").
an easy solution would be to add the load args into the
conn.table
call and specify the schema as aload_arg
, but imo this is suboptimal, as the write and save schema should usually (in my experience, my assertion could be wrong) be the sameThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gforsyth not sure if you have any insight here wrt the upcoming changes in ibis around schema and db hierarchy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so starting in 9.0,
schema
will be deprecated as a kwarg. We will be using the word "database" to refer to a collection of tables and the word "catalog" to refer to a collection of "database".For specifying only the
database
, you would do:if the database is located in a catalog (like, say
dbo
orsys
) you would do either of:or
(
schema
will still work in 9.0, it will just raise a FutureWarning)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could do a CI script to keep this bit in sync