-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add FileScanConfig::new()
API
#10623
Conversation
ec6fbb1
to
d7e0462
Compare
table_partition_cols: vec![], | ||
output_ordering: vec![], | ||
}; | ||
let scan_config = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a pretty good example of what the new API allows for:
- Less code (defaults are set)
- Makes it clearer what is the default and what is not
/// PartitionedFile::new("file2.parquet", 56), | ||
/// PartitionedFile::new("file3.parquet", 78), | ||
/// ]); | ||
/// ``` | ||
#[derive(Clone)] | ||
pub struct FileScanConfig { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is the new API and documentation. Note this is entirely backwards compatible (I did not change any fields to non pub, etc)
self | ||
} | ||
|
||
/// Add a file as a single group |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having to wrap a single file in a vec![vec![..]]
makes sense from the implementation point of view, but most users shouldn't have to worry about this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very clean, thank you! This is the first time I've seen both the with_<x>
and add_<x>
pattern used on a builder-like struct.
Is that something you generally do for all builder-type APIs, or it depends?
Thank you @phillipleblanc
I would say it depends. In this case I hadd a use case for However, I can see how we could do something else in this case, so perhaps I should remove the |
/// # use arrow_schema::Schema; | ||
/// use datafusion::datasource::listing::PartitionedFile; | ||
/// # use datafusion::datasource::physical_plan::FileScanConfig; | ||
/// # use datafusion_execution::object_store::ObjectStoreUrl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 maybe I should change this example to use the with style 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@phillipleblanc upon more consideration I htink you are right that sticking to one pattern (e.g. builder) will be cleaner. I removed the add*
apis in f228af32d
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the effort, this was a necessary API to introduce.
Thank you @metegenez |
Thanks everyone for the reviews! |
* Add FileScanConfig::new() API, update code to use new API * Remove add_* api
* Add FileScanConfig::new() API, update code to use new API * Remove add_* api
Which issue does this PR close?
Part of #10546
Rationale for this change
While working on #10549 it was cumbersome to create a
FileScanConfig
-- there are many different fields, many of which need default values which require some thought to figure out.The number of fields also makes it harder to see what is specifically set and what is the default
This has bothered me for a long time, so I made a PR to fix it.
You can see this in action here #10618
What changes are included in this PR?
Are these changes tested?
Covered by existing tests
Are there any user-facing changes?
Nicer and easier to use API for creating FileScanConfig