Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Load CDK] test improvements #50406

Merged
merged 3 commits into from
Jan 3, 2025

Conversation

edgao
Copy link
Contributor

@edgao edgao commented Dec 24, 2024

closes https://github.com/airbytehq/airbyte-internal-issues/issues/11218, closes https://github.com/airbytehq/airbyte-internal-issues/issues/11221

as usual, just look at each commit individually:

  • avro/parquet data dumpers no longer coerce to the expected schema
  • add some missed test cases
  • maybe fix the unknown types test?
    • afaict, we are just triggering the null-out behavior?
    • but also, iceberg doesn't even have the FailOnAllUnknownTypesExceptNull mapper, and it's also doing this
    • so I'm kind of confused 🤔

@edgao edgao requested a review from johnny-schmidt December 24, 2024 00:58
@edgao edgao requested a review from a team as a code owner December 24, 2024 00:58
Copy link

vercel bot commented Dec 24, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
airbyte-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jan 3, 2025 0:07am

val stream =
DestinationStream(
DestinationStream.Descriptor(randomizedNamespace, "problematic_types"),
Append,
ObjectType(
linkedMapOf(
"id" to
"id" to intType,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding an explicit id field helps RecordDiffer generate a nicer diff (i.e. instead of saying "there was an extra record, and also a missing record", it'll be able to match the records together + diff their fields)

Base automatically changed from edgao/iceberg_timestamp_fix to master January 3, 2025 00:01
@edgao edgao force-pushed the edgao/avro_parquet_data_dumping_without_coercion branch from 1d3fe56 to 6b4d481 Compare January 3, 2025 00:01
@@ -37,6 +37,9 @@ dependencies {
// temporary dependencies so that we can continue running the legacy test suite.
// eventually we should remove those tests + rely solely on the bulk CDK tests.
integrationTestLegacyImplementation testFixtures(project(":airbyte-cdk:java:airbyte-cdk:airbyte-cdk-s3-destinations"))

// TODO this should come from from the cdk plugin + respect the cdk version
integrationTestImplementation testFixtures(project(":airbyte-cdk:bulk:toolkits:bulk-cdk-toolkit-load-avro"))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fyi - submitted https://github.com/airbytehq/airbyte-internal-issues/issues/11245 + assigned it to myself, I want to knock this out before I forget, but don't want to block this PR on it

@edgao edgao merged commit 022b316 into master Jan 3, 2025
33 checks passed
@edgao edgao deleted the edgao/avro_parquet_data_dumping_without_coercion branch January 3, 2025 00:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants