From f60c2886485e70b4a9a2ad76ba4036544a39dd48 Mon Sep 17 00:00:00 2001 From: Dylan Date: Mon, 9 Sep 2024 16:34:50 +0800 Subject: [PATCH] Merge upstream (#1) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * feat: Add website layout (#130) * feat: Add website layout Signed-off-by: Xuanwo * publish to rust.i.a.o Signed-off-by: Xuanwo * Fix license Signed-off-by: Xuanwo * Let's try mdbook action Signed-off-by: Xuanwo * use cargo install Signed-off-by: Xuanwo * disable section Signed-off-by: Xuanwo * Add docs for website Signed-off-by: Xuanwo * Fix license Signed-off-by: Xuanwo * action approved Signed-off-by: Xuanwo --------- Signed-off-by: Xuanwo * feat: Expression system. (#132) * feat: Expressions * Fix comments * Refactor expression to be more similar to iceberg model * Fix typo * website: Fix typo in book.toml (#136) Signed-off-by: Xuanwo * Set ghp_path and ghp_branch properties (#138) * chore: Upgrade toolchain to 1.75.0 (#140) * feat: Add roadmap and features status in README.md (#134) * feat: Add roadmap and features status in README.md * Fix * Fix * Add more details according to comments * Revert unnecessary new line break * Nits --------- Co-authored-by: Fokko Driesprong * Infra: Remove `publish:` section from `.asf.yaml` (#141) * chore(deps): Bump peaceiris/actions-gh-pages from 3.9.2 to 3.9.3 (#143) Bumps [peaceiris/actions-gh-pages](https://github.com/peaceiris/actions-gh-pages) from 3.9.2 to 3.9.3. - [Release notes](https://github.com/peaceiris/actions-gh-pages/releases) - [Changelog](https://github.com/peaceiris/actions-gh-pages/blob/main/CHANGELOG.md) - [Commits](https://github.com/peaceiris/actions-gh-pages/compare/v3.9.2...v3.9.3) --- updated-dependencies: - dependency-name: peaceiris/actions-gh-pages dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): Update opendal requirement from 0.43 to 0.44 (#142) Updates the requirements on [opendal](https://github.com/apache/incubator-opendal) to permit the latest version. - [Release notes](https://github.com/apache/incubator-opendal/releases) - [Changelog](https://github.com/apache/incubator-opendal/blob/main/CHANGELOG.md) - [Commits](https://github.com/apache/incubator-opendal/compare/v0.43.0...v0.43.0) --- updated-dependencies: - dependency-name: opendal dependency-type: direct:production ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * docs: Change homepage to rust.i.a.o (#146) * feat: Introduce basic file scan planning. (#129) * Code complete * Resolve * Done * Fix comments * Fix comments * Fix comments * Fix * Fix comment * chore: Update contributing guide. (#163) * chore: Update reader api status (#162) * chore: Update reader api status * Restore unnecessary change * #154 : Add homepage to Cargo.toml (#160) * Add formatting for toml files (#167) * Add formatting for toml files * Update call to taplo * Add command to format and a command to check * chore(deps): Update env_logger requirement from 0.10.0 to 0.11.0 (#170) Updates the requirements on [env_logger](https://github.com/rust-cli/env_logger) to permit the latest version. - [Release notes](https://github.com/rust-cli/env_logger/releases) - [Changelog](https://github.com/rust-cli/env_logger/blob/main/CHANGELOG.md) - [Commits](https://github.com/rust-cli/env_logger/compare/v0.10.0...v0.10.2) --- updated-dependencies: - dependency-name: env_logger dependency-type: direct:production ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * feat: init file writer interface (#168) * init file writer interface * refine --------- Co-authored-by: ZENOTME * fix: Manifest parsing should consider schema evolution. (#171) * fix: Manifest parsing should consider schema evolution. * Fix ut * docs: Add release guide for iceberg-rust (#147) * fix: Ignore negative statistics value (#173) * feat: Add user guide for website. (#178) * Add * Fix format * Add license header * chore(deps): Update derive_builder requirement from 0.12.0 to 0.13.0 (#175) Updates the requirements on [derive_builder](https://github.com/colin-kiegel/rust-derive-builder) to permit the latest version. - [Release notes](https://github.com/colin-kiegel/rust-derive-builder/releases) - [Commits](https://github.com/colin-kiegel/rust-derive-builder/compare/v0.12.0...v0.12.0) --- updated-dependencies: - dependency-name: derive_builder dependency-type: direct:production ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Replace unwrap (#183) * feat: add handwritten serialize (#185) * add handwritten serialize * revert expect * remove expect * Fix avro schema names for manifest and manifest_list (#182) Co-authored-by: Fokko Driesprong * feat: Bump hive_metastore to use pure rust thrift impl `volo` (#174) * feat: Bump version 0.2.0 to prepare for release. (#181) * feat: Bump version 0.2.0 to prepare for release. * Update dependencies * fix: `default_partition_spec` using the `partion_spec_id` set (#190) * add unit tests * fix type * Docs: Add required Cargo version to install guide (#191) * chore(deps): Update opendal requirement from 0.44 to 0.45 (#195) Updates the requirements on [opendal](https://github.com/apache/opendal) to permit the latest version. 
- [Release notes](https://github.com/apache/opendal/releases) - [Changelog](https://github.com/apache/opendal/blob/main/CHANGELOG.md) - [Commits](https://github.com/apache/opendal/compare/v0.44.0...v0.44.2) --- updated-dependencies: - dependency-name: opendal dependency-type: direct:production ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Smooth out release steps (#197) Couple of small things: - The license check failed because the `dist/*` files were there - Add `dist/*` to gitignore since we don't want to push these files to the repo - Make `scripts/release.sh` executable - Align the svn structure with PyIceberg and Java * refactor: remove support of manifest list format as a list of file path (#201) * refactor: remove support of manifest list format as a list of file paths#158 * refactor: add field definition to manifest list * refactor: delete duplicated function * refactor: fix duplicate function name * refactor: remove unwraps (#196) * remove avro unwraps * rm unwrap in schema manifest * rm some expects * rm types * fix clippy * fix string format * refine some unwrap * undo schema.rs * Fix: add required rust version in cargo.toml (#193) * Fix: add required rust version in cargo.toml * added rust-version to workspace=true in package * Fix the REST spec version (#198) This number indicates from which release the code was generated. For example, currently new endpoints are added to the spec, but they are not supported by iceberg-rust yet. 
* feat: Add Sync + Send to Catalog trait (#202) * feat: Make thrift transport configurable (#194) * feat: make transport configurable (#188) * implement default for HmsThriftTransport * Add UnboundSortOrder (#115) * Add UnboundSortOrder * Separate build methods for bound and unbound * Use a constant for unsorted order_id * ci: Add workflow for publish (#218) * ci: Add workflow for publish Signed-off-by: Xuanwo * Fix publish Signed-off-by: Xuanwo --------- Signed-off-by: Xuanwo * ci: add workflow for cargo audit (#217) * docs: Add basic README for all crates (#215) * docs: Add basic README for all crates Signed-off-by: Xuanwo * Remove license Signed-off-by: Xuanwo * Update links Signed-off-by: Xuanwo --------- Signed-off-by: Xuanwo * Follow naming convention from Iceberg's Java and Python implementations (#204) * doc: Add download page (#219) * doc: Add download page * Fix links * chore(deps): Update derive_builder requirement from 0.13.0 to 0.20.0 (#203) Updates the requirements on [derive_builder](https://github.com/colin-kiegel/rust-derive-builder) to permit the latest version. - [Release notes](https://github.com/colin-kiegel/rust-derive-builder/releases) - [Commits](https://github.com/colin-kiegel/rust-derive-builder/compare/v0.13.0...v0.13.1) --- updated-dependencies: - dependency-name: derive_builder dependency-type: direct:production ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * test: add FileIO s3 test (#220) * add file io s3 test * add license * fixed version & rm port scanner * ci: Ignore RUSTSEC-2023-0071 for no actions to take (#222) * ci: Ignore RUSTSEC-2023-0071 for no actions to take Signed-off-by: Xuanwo * Fix license header Signed-off-by: Xuanwo --------- Signed-off-by: Xuanwo * feat: Add expression builder and display. (#169) * feat: Add expression builder and display. 
* Fix comments * Fix doc test * Fix name of op * Fix comments * Fix timestamp * chord: Add IssueNavigationLink for RustRover (#230) * chord: IssueNavigationLink for RustRover * move to .idea * add apache license --------- Co-authored-by: fuqijun * minor: Fix `double` API doc typo (#226) * feat: add `UnboundPredicate::negate()` (#228) Issue: #150 * fix: Remove deprecated methods to pass ci (#234) * Implement basic Parquet data file reading capability (#207) * feat: TableScan parquet file read to RecordBatch stream * chore: add inline hinting and fix incorrect comment * refactor: extract record batch reader * refactor: rename `FileRecordBatchReader` to `ArrowReader` * refactor: rename file_record_batch_reader.rs to arrow.rs * refactor: move `batch_size` param to `TableScanBuilder` * refactor: rename `TableScan.execute` to `to_arrow` * refactor: use builder pattern to create `ArrowReader` * chore: doc-test as a target (#235) * feat: add parquet writer (#176) * Add hive metastore catalog support (part 1/2) (#237) * fmt members * setup basic test-infra for hms-catalog * add license * add hms create_namespace * add hms get_namespace * fix: typo * add hms namespace_exists and drop_namespace * add hms update_namespace * move fns into HmsCatalog * use `expose` in docker-compose * add hms list_tables * fix: clippy * fix: cargo sort * fix: cargo workspace * move fns into utils + add constants * include database name in error msg * add pilota to cargo workspace * add minio version * change visibility to pub(crate); return namespace from conversion fn * add minio version in rest-catalog docker-compose * fix: hms test docker infrastructure * add version to minio/mc * fix: license header * fix: core-site --------- Co-authored-by: mlanhenke * chore: Enable projects. 
(#247) * Make plan_files as asynchronous stream (#243) * feat: Implement binding expression (#231) * feat: Implement binding expression * Implement Display instead of ToString (#256) * add rewrite_not (#263) * feat: init TableMetadataBuilder (#262) * Rename stat_table to table_exists in Catalog trait (#257) * feat (static table): implement a read-only table struct loaded from metadata (#259) * fixing some broken branch * adding readonly property to Table, and setting readonly value on StaticTable * feat: implement OAuth for catalog rest client (#254) * docs: annotate precision and length to primitive types (#270) Signed-off-by: Ruihang Xia * build: Restore CI by making parquet and arrow version consistent (#280) * Metadata Serde + default partition_specs and sort_orders (#272) * change serde metadata v2 * change default partition_specs and sort_orders * change test * use DEFAULTS * feat: make optional oauth param configurable (#278) * make optional oauth param configurable * fix review comments. 
--------- Co-authored-by: hpal * fix: enable public access to ManifestEntry properties (#284) * enable public access to ManifestEntry properties * implementing getter methods instead of direct access * feat: Implement the conversion from Arrow Schema to Iceberg Schema (#258) * feat: Implement the conversion from ArrowSchema to iceberg Schema * For review * Update test * Add LargeString, LargeBinary, LargeList and FixedSizeList * Add decimal type * For review * Fix clippy * Rename function name to add_manifests (#293) * feat: modify `Bind` calls so that they don't consume `self` and instead return a new struct, leaving the original unmoved (#290) * Add hive metastore catalog support (part 2/2) (#285) * fmt members * setup basic test-infra for hms-catalog * add license * add hms create_namespace * add hms get_namespace * fix: typo * add hms namespace_exists and drop_namespace * add hms update_namespace * move fns into HmsCatalog * use `expose` in docker-compose * add hms list_tables * fix: clippy * fix: cargo sort * fix: cargo workspace * move fns into utils + add constants * include database name in error msg * add pilota to cargo workspace * add minio version * change visibility to pub(crate); return namespace from conversion fn * add minio version in rest-catalog docker-compose * fix: hms test docker infrastructure * add version to minio/mc * fix: license header * fix: core-site * split utils and errors * add fn get_default_table_location * add fn get_metadata_location * add docs * add HiveSchemaBuilder * add schema to HiveSchemaBuilder * add convert_to_hive_table * cargo sort * implement table_ops without TableMetadataBuilder * refactor: HiveSchema fn from_iceberg * prepare table creation without metadata * simplify HiveSchemaBuilder * refactor: use ok_or_else() * simplify HiveSchemaBuilder * fix visibility of consts * change serde metadata v2 * change default partition_specs and sort_orders * change test * add create table with metadata * use FileIO::from_path * 
add test_load_table * small fixes + docs * rename * extract get_metadata_location from hive_table * add integration tests * fix: clippy * remove whitespace * fix: fixture names * remove builder-prefix `with` * capitalize error msg * remove trait bound `Display` * add const `OWNER` * fix: default warehouse location * add test-case `list_tables` * add all primitives to test_schema * exclude `Timestamptz` from hive conversion * remove Self::T from schema * remove context * keep file_io in HmsCatalog * use json schema repr --------- Co-authored-by: mlanhenke * feat: implement prune column for schema (#261) * feat: implement PruneColumn for Schema * fix: fix bugs for PruneColumn implementation * test: add test cases for PruneColumn * fix: fix minor to make more rusty * fix: fix cargo clippy * fix: construct expected_type from SchemaBuilder * fix: more readability * change return type of prune_column * chore(deps): Update reqwest requirement from ^0.11 to ^0.12 (#296) * Glue Catalog: Basic Setup + Test Infra (1/3) (#294) * extend dependency DIRS * create dependencies for glue * basic setup * rename test * add utils/get_sdk_config * add tests * add list_namespace * fix: clippy * fix: unused * fix: workspace * fix: name * use creds in test-setup * fix: empty dependencies.rust.tsv * fix: rename endpoint_url * remove deps.tsv * add hms deps.tsv * fix deps.tsv * fix: deps.tsv * feat: rest client respect prefix prop (#297) * feat: rest client respect prefix prop Signed-off-by: TennyZhuang * add test Signed-off-by: TennyZhuang * fix tests without prefix Signed-off-by: TennyZhuang * fix clippy Signed-off-by: TennyZhuang --------- Signed-off-by: TennyZhuang * fix: missing properties (#303) * fix: renaming FileScanTask.data_file to data_manifest_entry (#300) * renaming FileScanTask.data_file to data_manifest_entry * renaming data_file.content() to content_type() * changing pub method to data() * feat: Make OAuth token server configurable (#305) * feat: Glue Catalog - namespace 
operations (2/3) (#304) * add from_build_error * impl create_namespace * impl get_namespace * add macro with_catalog_id * impl namespace_exists * impl update_namespace * impl list_tables * impl drop_namespace * fix: clippy * update docs * update docs * fix: naming and visibility of error conversions * feat: add transform_literal (#287) * add transform_literal * refine * fix unwrap --------- Co-authored-by: ZENOTME * feat: Complete predicate builders for all operators. (#276) * feat: Complete predicate builders for all operators. * ci: fix fmt error * fix nan and notnan * feat: Support customized header in Rest catalog client (#306) Note: the default headers will not be overwritten. * fix: chrono dep (#274) * feat: Read Parquet data file with projection (#245) * feat: Read Parquet data file with projection * fix * Update * More * For review * Use FeatureUnsupported error. * Fix day timestamp micro (#312) * basic fix * change to Result * use try_unary * feat: support uri redirect in rest client (#310) Signed-off-by: TennyZhuang * refine: separate parquet reader and arrow convert (#313) * Upgrade to rust-version 1.77.1 (#316) * Support identifier warehouses (#308) * Support identifier warehouses This is a bit confusing if you come from a Hive background where the warehouse is always a path to hdfs/s3/etc. With the REST catalog, the warehouse can also be a logical identifier: https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L72-L78 This means that we have to make sure that we only parse paths that are an actual path, and not an identifier. I'm open to suggestions. The check is now very simple, but can be extended for example using a regex. But I'm not sure what the implications are of importing additional packages (in Python you want to keep it as lightweight as possible). 
* Use `if Url::parse().is_ok()` * feat: Project transform (#309) * add project bucket_unary * add project bucket_binary * add project bucket_set * add project identity * add project truncate * fixed array boundary * add project void * add project unknown * add docs + none projections * docs * docs * remove trait + impl boundary on Datum * fix: clippy * fix: test Transform::Unknown * add: transform_literal_result * add: transform_literal_result * remove: whitespace * move `boundary` to transform.rs * add check if transform can be applied to data_type * add check * add: java-testsuite Transform::Bucket * fix: clippy * add: timestamps to boundary * change: return bool from can_transform * fix: clippy * refactor: fn project match structure * add: java-testsuite Transform::Truncate * add: java-testsuite Transform::Dates + refactor * fix: doc * add: timestamp test + refactor * refactor: simplify projected_boundary * add: java-testsuite Transform::Timestamp * refactor tests * fix: timestamp conversion * fix: temporal test_result * basic fix * change to Result * use try_unary * add: java-testsuite Transform::Timestamp Hours * refactor: split and move tests * refactor: move transform tests * remove self * refactor: structure fn project + helpers * fix: clippy * fix: typo * fix: naming + generics * feat: add Struct Accessors to BoundReferences (#317) * feat: use str args rather than String in transform (#325) * chore(deps): Update pilota requirement from 0.10.0 to 0.11.0 (#327) Updates the requirements on [pilota](https://github.com/cloudwego/pilota) to permit the latest version. - [Release notes](https://github.com/cloudwego/pilota/releases) - [Commits](https://github.com/cloudwego/pilota/compare/pilota-0.10.0...pilota-0.10.0) --- updated-dependencies: - dependency-name: pilota dependency-type: direct:production ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): Bump peaceiris/actions-mdbook from 1 to 2 (#332) Bumps [peaceiris/actions-mdbook](https://github.com/peaceiris/actions-mdbook) from 1 to 2. - [Release notes](https://github.com/peaceiris/actions-mdbook/releases) - [Changelog](https://github.com/peaceiris/actions-mdbook/blob/main/CHANGELOG.md) - [Commits](https://github.com/peaceiris/actions-mdbook/compare/v1...v2) --- updated-dependencies: - dependency-name: peaceiris/actions-mdbook dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): Bump peaceiris/actions-gh-pages from 3.9.3 to 4.0.0 (#333) Bumps [peaceiris/actions-gh-pages](https://github.com/peaceiris/actions-gh-pages) from 3.9.3 to 4.0.0. - [Release notes](https://github.com/peaceiris/actions-gh-pages/releases) - [Changelog](https://github.com/peaceiris/actions-gh-pages/blob/main/CHANGELOG.md) - [Commits](https://github.com/peaceiris/actions-gh-pages/compare/v3.9.3...v4.0.0) --- updated-dependencies: - dependency-name: peaceiris/actions-gh-pages dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps): Bump apache/skywalking-eyes from 0.5.0 to 0.6.0 (#328) Bumps [apache/skywalking-eyes](https://github.com/apache/skywalking-eyes) from 0.5.0 to 0.6.0. - [Release notes](https://github.com/apache/skywalking-eyes/releases) - [Changelog](https://github.com/apache/skywalking-eyes/blob/main/CHANGES.md) - [Commits](https://github.com/apache/skywalking-eyes/compare/v0.5.0...v0.6.0) --- updated-dependencies: - dependency-name: apache/skywalking-eyes dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * feat: add BoundPredicateVisitor. Add AlwaysTrue and AlwaysFalse to Predicate (#334) * feat: add InclusiveProjection (#335) * feat: Implement the conversion from Iceberg Schema to Arrow Schema (#277) * support iceberg schema to arrow schema * avoid realloc hashmap --------- Co-authored-by: ZENOTME * Simplify expression when doing `{and,or}` operations (#339) This will make sure that we nicely reduce the expression in the inclusive projection visitor: https://github.com/apache/iceberg-rust/blob/de80a2436bb2fbbd5b4ec6bcafd0bd041b263595/crates/iceberg/src/expr/visitors/inclusive_projection.rs#L73 * feat: Glue Catalog - table operations (3/3) (#314) * add GlueSchemaBuilder * add warehouse * add serde_json, tokio, uuid * add minio * add create_table * add tests utils * add load_table * add drop_table + table_exists * add rename_table * add docs * fix: docs + err_msg * fix: remove unused const * fix: default_table_location * fix: remove single quotes error message * chore: add test-condition `test_rename_table` * chore: add test-condition `test_table_exists` * chore: update roadmap (#336) * chore: update roadmap * chore: update reader section * fix: read into arrow record batch * feat: add ManifestEvaluator (#322) * feat: init iceberg writer (#275) * init iceberg writer * refine * refine the interface --------- Co-authored-by: ZENOTME * feat: implement manifest filtering in TableScan (#323) * Refactor: Extract `partition_filters` from `ManifestEvaluator` (#360) * refactor: extract inclusive_projection from manifest_evaluator * refactor: add FileScanStreamContext * refactor: create partition_spec and partition_schema * refactor: add cache structs * refactor: use entry in partition_file_cache * refactor: use result * chore: update docs + fmt * refactor: add bound_filter to FileScanStreamContext * refactor: return ref BoundPredicate * fix: return type 
PartitionSpecRef * refactor: remove spec_id runtime check * feat: add check for content_type data * Basic Integration with Datafusion (#324) * chore: basic structure * feat: add IcebergCatalogProvider * feat: add IcebergSchemaProvider * feat: add IcebergTableProvider * chore: add integration test infr * fix: remove old test * chore: update crate structure * fix: remove workspace dep * refactor: use try_join_all * chore: remove feature flag * chore: rename package * chore: update readme * feat: add TableType * fix: import + async_trait * fix: imports + async_trait * chore: remove feature flag * fix: cargo sort * refactor: CatalogProvider `fn try_new` * refactor: SchemaProvider `fn try_new` * chore: update docs * chore: update docs * chore: update doc * feat: impl `fn schema` on TableProvider * chore: rename ArrowSchema * refactor: remove DashMap * feat: add basic IcebergTableScan * chore: fix docs * chore: add comments * fix: clippy * fix: typo * fix: license * chore: update docs * chore: move derive stmt * fix: collect into hashmap * chore: use DFResult * Update crates/integrations/datafusion/README.md Co-authored-by: Liang-Chi Hsieh --------- Co-authored-by: Renjie Liu Co-authored-by: Liang-Chi Hsieh * refactor: cache partition_schema in `fn plan_files()` (#362) * refactor: add partition_schema_cache * refactor: use context as param object * fix: test setup * refactor: clone only when cache miss * chore: move derive stmts * refactor: remove unused case_sensitive parameter * refactor: remove partition_schema_cache * refactor: move partition_filter into wider scope * fix (manifest-list): added serde aliases to support both forms conventions (#365) * added serde aliases to support both forms conventions * reading manifests without avro schema * adding avro files of both versions and add a test to deser both * fixed typo * feat: Extract FileRead and FileWrite trait (#364) * feat: Extract FileRead and FileWrie trait Signed-off-by: Xuanwo * Enable s3 services for tests 
Signed-off-by: Xuanwo * Fix sort Signed-off-by: Xuanwo * Add comment for io trait Signed-off-by: Xuanwo * Fix test for rest Signed-off-by: Xuanwo * Use try join Signed-off-by: Xuanwo --------- Signed-off-by: Xuanwo * feat: Convert predicate to arrow filter and push down to parquet reader (#295) * feat: Convert predicate to arrow filter and push down to parquet reader * For review * Fix clippy * Change from vector of BoundPredicate to BoundPredicate * Add test for CollectFieldIdVisitor * Return projection_mask for leaf column * Update * For review * For review * For review * For review * More * fix * Fix clippy * More * Fix clippy * fix clippy * chore(deps): Update datafusion requirement from 37.0.0 to 38.0.0 (#369) * chore(deps): Update itertools requirement from 0.12 to 0.13 (#376) * Add `InclusiveMetricsEvaluator` (#347) * feat: add InclusiveMetricsEvaluator * test: add more tests for InclusiveMetricsEvaluator * Rename V2 spec names. (#380) * make file scan task serializable (#377) Co-authored-by: ZENOTME * Feature: Schema into_builder method (#381) * replaced `i32` in `TableUpdate::SetDefaultSortOrder` to `i64` (#387) * fix: make PrimitiveLiteral and Literal not be Ord (#386) * make PrimitiveLiteral and Literal not be Ord * refine Map * fix name * fix map test * refine --------- Co-authored-by: ZENOTME * docs(writer/docker): fix small typos and wording (#389) * docs: fixup docker compose test_utils * docs: iceberg writer close fn * feat: `StructAccessor.get` returns `Result>` instead of `Result` (#390) This is so that the accessor's result can represent null field values. 
Fixes: #379 * feat: add `ExpressionEvaluator` (#363) * refactor: add partition_schema_cache * refactor: use context as param object * fix: test setup * refactor: clone only when cache miss * chore: move derive stmts * feat: add basic setup expression evaluator * refactor: remove unused case_sensitive parameter * chore: add doc * refactor: remove partition_schema_cache * refactor: move partition_filter into wider scope * feat: add expression_evaluator_cache and apply in scan.rs * chore: remove comment * refactor: remove unused test setup fn * feat: add basic test infr + simple predicate evaluation * fix: clippy * feat: impl `is_null` + `not_null` * feat: impl `is_nan` + `not_nan` * chore: change result type * feat: impl `less_than` + `greater_than` * chore: fix return type * feat: impl `eq` + `not_eq` * feat: impl `starts_with` + `not_starts_with` * feat: impl + * chore: add tests for and and or expr * chore: move test * chore: remove unused_vars * chore: update docs * chore: update docs * fix: typo * refactor: compare datum instead of primitive literal * refactor: use Result