Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dl/coordinator: add tag to redpanda commits #25038

Merged
merged 2 commits into from
Feb 7, 2025

Conversation

andrwng
Copy link
Contributor

@andrwng andrwng commented Feb 5, 2025

Adds a tag to snapshots created by Redpanda. This ensures that the
latest Redpanda snapshot won't be removed by snapshot expiry, since tags
don't expire by default. This assurance helps us guarantee exactly once
delivery even in the face of external table updates.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.3.x
  • v24.2.x
  • v24.1.x

Release Notes

  • none

@vbotbuildovich
Copy link
Collaborator

vbotbuildovich commented Feb 5, 2025

Retry command for Build#61608

please wait until all jobs are finished before running the slash command



/ci-repeat 1
tests/rptest/tests/datalake/datalake_e2e_test.py::DatalakeE2ETests.test_avro_schema@{"catalog_type":"rest_jdbc","cloud_storage_type":1,"query_engine":"spark"}
tests/rptest/tests/datalake/datalake_e2e_test.py::DatalakeE2ETests.test_avro_schema@{"catalog_type":"rest_hadoop","cloud_storage_type":1,"query_engine":"spark"}
tests/rptest/tests/datalake/datalake_e2e_test.py::DatalakeE2ETests.test_avro_schema@{"catalog_type":"rest_jdbc","cloud_storage_type":1,"query_engine":"trino"}
tests/rptest/tests/datalake/datalake_e2e_test.py::DatalakeE2ETests.test_avro_schema@{"catalog_type":"rest_hadoop","cloud_storage_type":1,"query_engine":"trino"}

@vbotbuildovich
Copy link
Collaborator

vbotbuildovich commented Feb 5, 2025

CI test results

test results on build#61608
test_id test_kind job_url test_status passed
rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery ducktape https://buildkite.com/redpanda/redpanda/builds/61608#0194d5dd-b529-42cd-8ecd-ea7be153fedf FLAKY 1/2
rptest.tests.datalake.compaction_test.CompactionGapsTest.test_translation_no_gaps.cloud_storage_type=CloudStorageType.S3.catalog_type=CatalogType.REST_HADOOP ducktape https://buildkite.com/redpanda/redpanda/builds/61608#0194d5ca-8ada-4340-8fbd-1d53521f126c FLAKY 1/2
rptest.tests.datalake.datalake_dlq_test.DatalakeDLQTest.test_invalid_record_action_runtime_change.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.SPARK.use_topic_property=True ducktape https://buildkite.com/redpanda/redpanda/builds/61608#0194d5dd-b529-42cd-8ecd-ea7be153fedf FLAKY 1/2
rptest.tests.datalake.datalake_e2e_test.DatalakeE2ETests.test_avro_schema.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.SPARK.catalog_type=CatalogType.REST_HADOOP ducktape https://buildkite.com/redpanda/redpanda/builds/61608#0194d5ca-8adc-48f2-8f04-6543f884d759 FAIL 0/20
rptest.tests.datalake.datalake_e2e_test.DatalakeE2ETests.test_avro_schema.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.SPARK.catalog_type=CatalogType.REST_HADOOP ducktape https://buildkite.com/redpanda/redpanda/builds/61608#0194d5dd-b528-4551-8088-6bfb3198d6e5 FAIL 0/20
rptest.tests.datalake.datalake_e2e_test.DatalakeE2ETests.test_avro_schema.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.SPARK.catalog_type=CatalogType.REST_JDBC ducktape https://buildkite.com/redpanda/redpanda/builds/61608#0194d5ca-8ad9-4f75-91b4-792f417a8414 FAIL 0/20
rptest.tests.datalake.datalake_e2e_test.DatalakeE2ETests.test_avro_schema.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.SPARK.catalog_type=CatalogType.REST_JDBC ducktape https://buildkite.com/redpanda/redpanda/builds/61608#0194d5dd-b529-42cd-8ecd-ea7be153fedf FAIL 0/20
rptest.tests.datalake.datalake_e2e_test.DatalakeE2ETests.test_avro_schema.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.TRINO.catalog_type=CatalogType.REST_HADOOP ducktape https://buildkite.com/redpanda/redpanda/builds/61608#0194d5ca-8ada-4340-8fbd-1d53521f126c FAIL 0/20
rptest.tests.datalake.datalake_e2e_test.DatalakeE2ETests.test_avro_schema.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.TRINO.catalog_type=CatalogType.REST_HADOOP ducktape https://buildkite.com/redpanda/redpanda/builds/61608#0194d5dd-b529-4420-ac71-fa63c36e7f6c FAIL 0/20
rptest.tests.datalake.datalake_e2e_test.DatalakeE2ETests.test_avro_schema.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.TRINO.catalog_type=CatalogType.REST_JDBC ducktape https://buildkite.com/redpanda/redpanda/builds/61608#0194d5ca-8adb-4425-ad48-7fba6a8eb87b FAIL 0/20
rptest.tests.datalake.datalake_e2e_test.DatalakeE2ETests.test_avro_schema.cloud_storage_type=CloudStorageType.S3.query_engine=QueryEngineType.TRINO.catalog_type=CatalogType.REST_JDBC ducktape https://buildkite.com/redpanda/redpanda/builds/61608#0194d5dd-b527-4bd6-816b-c7100a5d6d5a FAIL 0/20
rptest.tests.write_caching_fi_e2e_test.WriteCachingFailureInjectionE2ETest.test_crash_all_with_consumer_group ducktape https://buildkite.com/redpanda/redpanda/builds/61608#0194d5dd-b529-4420-ac71-fa63c36e7f6c FLAKY 1/2
test results on build#61628
test_id test_kind job_url test_status passed
rptest.tests.cloud_storage_chunk_read_path_test.CloudStorageChunkReadTest.test_read_chunks ducktape https://buildkite.com/redpanda/redpanda/builds/61628#0194d7e7-78f5-44eb-b0e2-3cfe437010c6 FLAKY 1/2
rptest.tests.datalake.compaction_test.CompactionGapsTest.test_translation_no_gaps.cloud_storage_type=CloudStorageType.S3.catalog_type=CatalogType.REST_HADOOP ducktape https://buildkite.com/redpanda/redpanda/builds/61628#0194d804-1fc4-4dbd-ba85-a2d7c5f5ca33 FLAKY 1/2
rptest.tests.datalake.compaction_test.CompactionGapsTest.test_translation_no_gaps.cloud_storage_type=CloudStorageType.S3.catalog_type=CatalogType.REST_JDBC ducktape https://buildkite.com/redpanda/redpanda/builds/61628#0194d804-1fc2-498f-8f66-3808fa6331a5 FLAKY 1/2
rptest.tests.datalake.mount_unmount_test.MountUnmountIcebergTest.test_simple_unmount.cloud_storage_type=CloudStorageType.S3 ducktape https://buildkite.com/redpanda/redpanda/builds/61628#0194d7e7-78f6-443d-ae9f-5d2e50342349 FLAKY 1/2
rptest.tests.e2e_shadow_indexing_test.EndToEndShadowIndexingTestCompactedTopic.test_compacting_during_leadership_transfer.cloud_storage_type=CloudStorageType.ABS ducktape https://buildkite.com/redpanda/redpanda/builds/61628#0194d804-1fc4-4dbd-ba85-a2d7c5f5ca33 FLAKY 1/2
rptest.tests.e2e_shadow_indexing_test.ShadowIndexingWhileBusyTest.test_create_or_delete_topics_while_busy.short_retention=True.cloud_storage_type=CloudStorageType.ABS ducktape https://buildkite.com/redpanda/redpanda/builds/61628#0194d804-1fc3-4688-b6f3-b125d6c1a798 FLAKY 1/2
rptest.tests.scaling_up_test.ScalingUpTest.test_scaling_up_with_recovered_topic ducktape https://buildkite.com/redpanda/redpanda/builds/61628#0194d804-1fc3-4688-b6f3-b125d6c1a798 FLAKY 1/2
test results on build#61697
test_id test_kind job_url test_status passed
rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery ducktape https://buildkite.com/redpanda/redpanda/builds/61697#0194dd7d-3dab-41ed-bf09-ab49a69bce3e FLAKY 1/2
rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery ducktape https://buildkite.com/redpanda/redpanda/builds/61697#0194dd99-2f15-4002-a265-5cf6fdfa445a FLAKY 1/2
rptest.tests.compaction_recovery_test.CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade ducktape https://buildkite.com/redpanda/redpanda/builds/61697#0194dd7d-3dac-48c0-aa66-ad6c82376181 FLAKY 1/2
rptest.tests.datalake.compaction_test.CompactionGapsTest.test_translation_no_gaps.cloud_storage_type=CloudStorageType.S3.catalog_type=CatalogType.REST_JDBC ducktape https://buildkite.com/redpanda/redpanda/builds/61697#0194dd99-2f12-4a12-88db-9637faf59494 FLAKY 1/3
rptest.tests.scaling_up_test.ScalingUpTest.test_scaling_up_with_recovered_topic ducktape https://buildkite.com/redpanda/redpanda/builds/61697#0194dd99-2f14-4770-b196-11101576d2d6 FLAKY 1/2

@andrwng andrwng force-pushed the dl-coordinator-tag-appends branch from 1bde4d8 to 1958111 Compare February 5, 2025 19:40
@oleiman
Copy link
Member

oleiman commented Feb 6, 2025

This ensures that the latest Redpanda snapshot won't be removed by snapshot expiry, since tags don't expire by default.

Is this part of the Iceberg spec? Cursory check didn't turn anything up, but I may not be searching for the right thing.

EDIT: ah, i see, tags & branches have separate expiration policies. so in principle we could still get in a funky situation w/ customized table maintenance, but the default behavior is what we want?

Copy link
Member

@oleiman oleiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

few question but generally lgtm

@andrwng
Copy link
Contributor Author

andrwng commented Feb 6, 2025

This ensures that the latest Redpanda snapshot won't be removed by snapshot expiry, since tags don't expire by default.

Is this part of the Iceberg spec? Cursory check didn't turn anything up, but I may not be searching for the right thing.

EDIT: ah, i see, tags & branches have separate expiration policies. so in principle we could still get in a funky situation w/ customized table maintenance, but the default behavior is what we want?

Generally it's not part of the spec, but it is in the Java implementation of snapshot expiry -- the default is set per table. Though you raise a good point that there is a mechanism per snapshot that we can set to not expire. Will update.

@andrwng andrwng force-pushed the dl-coordinator-tag-appends branch from 1958111 to 491e6d1 Compare February 6, 2025 20:56
Adds the capability to tag a snapshot when appended. This will be useful
for pinning a snapshot to the table, since tagged snapshots don't expire
by default. We'd like to keep some metadata around in Redpanda-generated
snapshots for exactly once delivery, and this helps ensure that metadata
don't get wiped out.
@andrwng andrwng force-pushed the dl-coordinator-tag-appends branch from 491e6d1 to d3f1710 Compare February 6, 2025 20:57
@andrwng andrwng requested review from oleiman and bharathv February 6, 2025 21:01
oleiman
oleiman previously approved these changes Feb 6, 2025
Copy link
Member

@oleiman oleiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. thanks for various clarifications

Adds a tag to snapshots created by Redpanda. This ensures that the
latest Redpanda snapshot won't be removed by snapshot expiry, since tags
don't expire by default. This assurance helps us guarantee exactly once
delivery even in the face of external table updates.
@andrwng andrwng merged commit 094551f into redpanda-data:dev Feb 7, 2025
18 checks passed
{{commit_meta_prop, to_json_str(commit_meta)}});
{{commit_meta_prop, to_json_str(commit_meta)}},
commit_tag_name,
/*tag_expiration_ms=*/std::numeric_limits<long>::max());
Copy link
Contributor

@nvartolomei nvartolomei Feb 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this should be std::numeric_limits<long long>::max() std::numeric_limits<int64_t>::max() (together with the underlying type def). We never compile for a platform where this would result in 32 bit integer afaik but being explicit doesn't hurt.

long is defined as "at least" 32 bit by the C++ standard, on unix x86/arm 64bit it is actually a 64 bit number so all good, windows is 32 bit (who cares)

32 bit max as ms is ~24 days

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this should be int64_t in case the definition of long long changes... (i.e. it becomes 128 bit) long in iceberg spec is defined as a 64 bit wide integer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, #25118

andrwng added a commit to andrwng/redpanda that referenced this pull request Feb 19, 2025
Longs in Iceberg are 64-bit, so we should be explicit about that.

Review follow-up from
redpanda-data#25038
@WillemKauf
Copy link
Contributor

Nessie dislikes this:

https://github.com/projectnessie/nessie/blob/e0014a800976fe038379aea8905cb5cd4fce56b0/catalog/format/iceberg/src/main/java/org/projectnessie/catalog/formats/iceberg/rest/IcebergMetadataUpdate.java#L707-L709

2025-02-21 05:30:06,267 INFO  [io.qua.htt.access-log] (executor-thread-1) 192.168.215.24 - - [21/Feb/2025:05:30:06 +0000] "GET /iceberg/v1/main/namespaces/redpanda/tables/tapioca HTTP/1.1" 200 3858
refName(): redpanda.tag, type(): tag
2025-02-21 05:30:06,283 WARN  [org.pro.cat.ser.res.IcebergErrorMapper] (executor-thread-1) Unhandled exception returned as HTTP/500: java.lang.IllegalStateException: Nessie only supports the current snapshot-ref 'main', use Nessie's branches instead: java.lang.IllegalStateException: Nessie only supports the current snapshot-ref 'main', use Nessie's branches instead

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants