Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-35576]Fix a corruption bug in CreateColumnFamilyWithImport() (#12602) #77

Merged
merged 7 commits into from
Jul 16, 2024

Conversation

mayuehappy
Copy link

@mayuehappy mayuehappy commented Jul 2, 2024

We support the API related to ingest DB in FRocksDb-8.10.0, but many of the fixes related to ingest DB were only integrated in the latest RocksDB version. So we need to add following commit cherrypicks to FRocksDB.
The ingestDb relative commits are:
facebook/rocksdb#12526
facebook/rocksdb#12602

I found that unit testing did not pass for two main reasons:

  1. AssignEpochNumberToMultipleCF failed mainly because the following changes require cherry picking to frocksdb
    Speedup based on number of files marked for compaction facebook/rocksdb#12306
    Detect compaction pressure at lower debt ratios facebook/rocksdb#12236
    Speedup based on pending compaction bytes relative to data size facebook/rocksdb#12130

  2. Some single tests related to folly cannot be successfully constructed. Since folly is not currently used in Frocksdb, I think it is possible to temporarily remove ut related to folly

…nFamilyJob::Prepare (#12526)

Summary:
This PR fixes error for CF smallest and largest keys computation in ImportColumnFamilyJob::Prepare.
Before this fix smallest and largest keys for CF were computed incorrectly, and ImportColumnFamilyJob::Prepare function might not have detect overlaps between CFs. I added test to detect this error.

Pull Request resolved: facebook/rocksdb#12526

Reviewed By: hx235

Differential Revision: D56046044

Pulled By: ajkr

fbshipit-source-id: d562fbfc9cc2d9624372d24d34a649198a960691
(cherry picked from commit 70d3fc3b6f0bebc3f45e34cc7c3f9fa8ab064fdb)
@mayuehappy mayuehappy force-pushed the FLINK-35576 branch 6 times, most recently from 5eebf1e to 2f491ed Compare July 10, 2024 12:59
cbi42 and others added 4 commits July 11, 2024 11:07
Summary:
when importing files from multiple CFs into a new CF, we were reusing the epoch numbers assigned by the original CFs. This means L0 files in the new CF can have the same epoch number (assigned originally by different CFs). While CreateColumnFamilyWithImport() requires each original CF to have disjoint key range, after an intra-l0 compaction, we still can end up with L0 files with the same epoch number but overlapping key range. This PR attempt to fix this by reassigning epoch numbers when importing multiple CFs.

Pull Request resolved: facebook/rocksdb#12602

Test Plan:
a new repro unit test. Before this PR, it fails with
```
[ RUN      ] ImportColumnFamilyTest.AssignEpochNumberToMultipleCF
db/import_column_family_test.cc:1048: Failure
db_->WaitForCompact(o)
Corruption: force_consistency_checks(DEBUG): VersionBuilder: L0 files of same epoch number but overlapping range facebook/rocksdb#44 , smallest key: '6B6579303030303030' seq:511, type:1 , largest key: '6B6579303031303239' seq:510, type:1 , epoch number: 3 vs. file facebook/rocksdb#36 , smallest key: '6B6579303030313030' seq:401, type:1 , largest key: '6B6579303030313939' seq:500, type:1 , epoch number: 3
```

Reviewed By: hx235

Differential Revision: D56851808

Pulled By: cbi42

fbshipit-source-id: 01b8c790c9f1f2a168047ead670e73633f705b84

(cherry picked from commit 6fdc4c52823d0b32bb18321b0d4b14ab70d09e92)
Summary:
RocksDB self throttles per-DB compaction parallelism until it detects compaction pressure. The pressure detection based on pending compaction bytes was only comparing against the slowdown trigger (`soft_pending_compaction_bytes_limit`). Online services tend to set that extremely high to avoid stalling at all costs. Perhaps they should have set it to zero, but we never documented that zero disables stalling so I have been telling everyone to increase it for years.

This PR adds pressure detection based on pending compaction bytes relative to the size of bottommost data. The size of bottommost data should be fairly stable and proportional to the logical data size

Pull Request resolved: facebook/rocksdb#12130

Reviewed By: hx235

Differential Revision: D52000746

Pulled By: ajkr

fbshipit-source-id: 7e1fd170901a74c2d4a69266285e3edf6e7631c7
(cherry picked from commit d8e4762)
Summary:
This PR significantly reduces the compaction pressure threshold introduced in facebook/rocksdb#12130 by a factor of 64x. The original number was too high to trigger in scenarios where compaction parallelism was needed.

Pull Request resolved: facebook/rocksdb#12236

Reviewed By: cbi42

Differential Revision: D52765685

Pulled By: ajkr

fbshipit-source-id: 8298e966933b485de24f63165a00e672cb9db6c4
(cherry picked from commit 2dda7a0)
Summary:
RocksDB self throttles per-DB compaction parallelism until it detects compaction pressure. This PR adds pressure detection based on the number of files marked for compaction.

Pull Request resolved: facebook/rocksdb#12306

Reviewed By: cbi42

Differential Revision: D53200559

Pulled By: ajkr

fbshipit-source-id: 63402ee336881a4539204d255960f04338ab7a0e
(cherry picked from commit aacf60dda2a138f9d3826c25818a3bcf250859fd)
@Zakelly
Copy link
Collaborator

Zakelly commented Jul 12, 2024

The pick list LGTM. I'm also fine with disabling the folly CI. But it would be great if we could fix folly. IIUC, the folly is the essential dependency for async IO, right?

Summary:
- Updated pinned folly version to the latest
- gcc/g++ 10 is required since facebook/folly@2c1c617e9e so we had to modify the tests using gcc/g++ 7
- libsodium 1.0.17 is no longer downloadable from GitHub so I found it elsewhere. I will submit a PR for that upstream to folly
- USE_FOLLY_LITE changes
  - added boost header dependency instead of commenting out the `#include`s since that approach stopped working
  - added "folly/lang/Exception.cpp" to the compilation

Pull Request resolved: facebook/rocksdb#12795

Reviewed By: hx235

Differential Revision: D58916693

Pulled By: ajkr

fbshipit-source-id: b5f9bca2d929825846ac898b785972b071db62b1

(cherry picked from commit 40944cbbdbdcfac694fc3b291ba1838e943a789b)
Summary:
facebook/folly@843fd57 fixed the URL for libsodium. Updated folly version to latest, which includes that commit. I am not sure the URL will be stable, but it still seems better than substituting the URL.

Pull Request resolved: facebook/rocksdb#12801

Reviewed By: cbi42

Differential Revision: D58921033

Pulled By: ajkr

fbshipit-source-id: 442ea3ff83ced2679ea9bfd04945e9449ce2ff96
(cherry picked from commit 13549817afd97ce29705c42e87fa945056fd2d11)
@mayuehappy
Copy link
Author

The pick list LGTM. I'm also fine with disabling the folly CI. But it would be great if we could fix folly. IIUC, the folly is the essential dependency for async IO, right?

@Zakelly Thanks for the reviewing , I've fix the UT test with folly

Copy link
Collaborator

@Zakelly Zakelly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mayuehappy !

@Zakelly Zakelly merged commit 9a10201 into ververica:FRocksDB-8.10.0 Jul 16, 2024
38 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants