[CORE-8760] rptest: fix race-y checks in test_index_recovery_after_upgrade #25131

Open
wants to merge 1 commit into
base: dev

WillemKauf
Contributor

@WillemKauf WillemKauf commented Feb 21, 2025

This test can race with segment rolls and un-self-compacted segments, since self-compaction may not occur before redpanda version changes and node restarts occur. This breaks the test's expectations around the compacted index mtime() stats.

To fix the race conditions, return the partition data used to evaluate compaction as being finished, and use that to perform the equality/inequality checks between compacted index mtime() values.

Also ensure that all segments present in the partition data that are produced during a version epoch are self-compacted in that same epoch, per the expectations of the test.
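
The snapshot-based comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual rptest code: `SegmentInfo`, `index_mtimes`, and `changed_indices` are hypothetical names, and each mtime is modeled as a plain float. The point is that both sides of the equality/inequality check come from the same partition data that was used to decide compaction had finished, so a segment that rolls afterwards cannot skew the comparison.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class SegmentInfo:
    # Hypothetical stand-in for one segment entry in the partition data.
    name: str
    compacted_index_mtime: float


def index_mtimes(partition: List[SegmentInfo]) -> Dict[str, float]:
    """Map each segment name to the mtime of its compacted index."""
    return {seg.name: seg.compacted_index_mtime for seg in partition}


def changed_indices(before: List[SegmentInfo],
                    after: List[SegmentInfo]) -> Set[str]:
    """Names of segments present in both snapshots whose index mtime differs.

    Segments that exist in only one snapshot (for example, ones created by a
    later segment roll) are ignored rather than treated as mismatches.
    """
    old, new = index_mtimes(before), index_mtimes(after)
    return {name for name in old.keys() & new.keys() if old[name] != new[name]}
```

A test could then assert that `changed_indices(before, after)` is empty where the indices are expected to be untouched, and non-empty where they are expected to be rebuilt.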

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.3.x
  • v24.2.x
  • v24.1.x

Release Notes

  • none

@vbotbuildovich
Collaborator

vbotbuildovich commented Feb 21, 2025

CI test results

test results on build#62107

| test_id | test_kind | job_url | test_status | passed |
| --- | --- | --- | --- | --- |
| rptest.tests.compaction_recovery_test.CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade | ducktape | https://buildkite.com/redpanda/redpanda/builds/62107#019529d5-d4b9-4c50-b297-c88be6119185 | FLAKY | 1/2 |

test results on build#62115

| test_id | test_kind | job_url | test_status | passed |
| --- | --- | --- | --- | --- |
| rptest.tests.compaction_recovery_test.CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade | ducktape | https://buildkite.com/redpanda/redpanda/builds/62115#01952ab8-c161-472a-bcec-3c67a7e0e6dd | FLAKY | 1/2 |
| rptest.tests.compaction_recovery_test.CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade | ducktape | https://buildkite.com/redpanda/redpanda/builds/62115#01952acd-b310-4402-b4ce-6475c54cd11d | FLAKY | 1/2 |
| rptest.tests.retention_policy_test.ShadowIndexingCloudRetentionTest.test_cloud_time_based_retention.cloud_storage_type=CloudStorageType.S3 | ducktape | https://buildkite.com/redpanda/redpanda/builds/62115#01952acd-b30f-4995-9089-dc1af7203903 | FLAKY | 1/2 |
| rptest.tests.scaling_up_test.ScalingUpTest.test_scaling_up_with_recovered_topic | ducktape | https://buildkite.com/redpanda/redpanda/builds/62115#01952ab8-c163-4118-b725-5008ec3e1087 | FLAKY | 1/2 |
| storage_e2e_single_thread_rpunit.storage_e2e_single_thread_rpunit | unit | https://buildkite.com/redpanda/redpanda/builds/62115#01952a71-6eda-462c-8080-0b4a053566d4 | FLAKY | 1/2 |

test results on build#62124

| test_id | test_kind | job_url | test_status | passed |
| --- | --- | --- | --- | --- |
| rptest.tests.compaction_recovery_test.CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade | ducktape | https://buildkite.com/redpanda/redpanda/builds/62124#01952bb0-c90a-468d-b827-13559641f932 | FLAKY | 50/51 |
| rptest.tests.compaction_recovery_test.CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade | ducktape | https://buildkite.com/redpanda/redpanda/builds/62124#01952bb0-c90a-4bf7-8a90-f52e84cbab5b | FLAKY | 50/53 |
| rptest.tests.compaction_recovery_test.CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade | ducktape | https://buildkite.com/redpanda/redpanda/builds/62124#01952bb4-2fe3-4789-8f94-86b3428de412 | FLAKY | 50/54 |
| rptest.tests.compaction_recovery_test.CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade | ducktape | https://buildkite.com/redpanda/redpanda/builds/62124#01952bb4-2fe3-4e3d-9c76-92cd013c08d5 | FLAKY | 50/52 |

@WillemKauf
Contributor Author

Still flaky. Hmm.

@WillemKauf WillemKauf force-pushed the compaction_recovery_upgrade_test_fix branch from ad6debc to ef46591 Compare February 21, 2025 21:34
This test can race with segment rolls and un-self-compacted segments,
since self-compaction may not occur before `redpanda` version changes
and node restarts occur. This breaks the test's expectations around the
compacted index `mtime()` stats.

To fix the race conditions, return the `partition` data used to evaluate
compaction as being finished, and use that to perform the equality/inequality
checks between compacted index `mtime()` values.

Also ensure that all segments present in the `partition` data that are produced
during a version epoch are self-compacted in that same epoch, per expectations
of the test.
@WillemKauf WillemKauf force-pushed the compaction_recovery_upgrade_test_fix branch from ef46591 to 854b56a Compare February 21, 2025 21:35
@WillemKauf
Contributor Author

Force push to:

  • Add new condition to finished_compaction() to ensure all segments in partition data used to evaluate mtime() equality/inequality have finished self-compaction in the version epoch in which they were produced.
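
The kind of condition described in this bullet could look roughly like the sketch below. It is only an illustration: the `Segment` fields and the helper `all_self_compacted_in_own_epoch` are hypothetical and do not reflect how `finished_compaction()` is actually implemented in the test.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Segment:
    # Hypothetical fields, assumed for illustration only.
    name: str
    produced_epoch: int                   # version epoch in which the segment was written
    self_compacted_epoch: Optional[int]   # epoch in which self-compaction finished, if any


def all_self_compacted_in_own_epoch(segments: Iterable[Segment]) -> bool:
    """True only if every segment finished self-compaction in the epoch that produced it."""
    return all(seg.self_compacted_epoch == seg.produced_epoch for seg in segments)
```

A segment that has not been self-compacted yet (`self_compacted_epoch` is `None`), or that was only self-compacted after a version change, makes the predicate return `False`, so the wait would continue instead of proceeding to the mtime() checks.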

@WillemKauf
Contributor Author

/ci-repeat 2
skip-build
skip-units
dt-repeat=50
tests/rptest/tests/compaction_recovery_test.py

@WillemKauf WillemKauf enabled auto-merge February 22, 2025 05:14
@WillemKauf WillemKauf disabled auto-merge February 22, 2025 17:20
@WillemKauf
Contributor Author

WillemKauf commented Feb 22, 2025

Still flaky.
