
Bug fixes for create_vdev #381

Merged
merged 5 commits into from
Apr 17, 2024

Conversation

xiaoxichen
Collaborator

  1. max_num_chunks should be based on vdev_size / min_chunk_size; the previous implementation wrongly included other pdevs that don't belong to this vdev (e.g. of a different type).

  2. We should never round up the vdev size, as this may cause the vdev size to exceed the underlying pdev size.

  3. A few other fixes.

Add a set of sanity checks as well as a UT to ensure chunks are aligned to pdev->align_size().

Fixes: #377
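The two headline fixes can be sketched as follows. This is a minimal illustration of the idea, assuming hypothetical helper names (`max_num_chunks_for`, `round_down_vdev_size`), not HomeStore's actual API:

```cpp
#include <cstdint>

// Fix 1 (sketch): derive the chunk-count cap from this vdev's own size,
// not from the capacity of every pdev in the system.
inline uint32_t max_num_chunks_for(uint64_t vdev_size, uint64_t min_chunk_size) {
    return static_cast< uint32_t >(vdev_size / min_chunk_size);
}

// Fix 2 (sketch): round the vdev size *down* to a chunk_size multiple so it
// can never exceed the underlying pdev size.
inline uint64_t round_down_vdev_size(uint64_t requested, uint64_t chunk_size) {
    return (requested / chunk_size) * chunk_size;
}
```

Rounding down instead of up means a request slightly larger than a chunk multiple is trimmed, never inflated past the pdev's capacity.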

Signed-off-by: Xiaoxi Chen <[email protected]>
}
// Based on the min chunk size, we calculate the max number of chunks that can be created in each target pdev
uint32_t min_chunk_size = hs_super_blk::min_chunk_size(vparam.dev_type);
// FIXME: it is possible that each vdev is less than max_num_chunks, but total is more than MAX_CHUNKS_IN_SYSTEM.
Contributor
@yamingk yamingk Apr 15, 2024

This is a true statement, but it won't cause any problem, right?
max_num_chunks only guarantees chunk_size won't be smaller than min_chunk_size.
The actual vparam.num_chunks is a std::min of itself and this max_num_chunks (line 223).
And the input vparam.num_chunks is provided by the HS consumer, and this consumer can ensure that all the services requested in HS won't exceed MAX_CHUNKS_IN_SYSTEM, right? (We can add an assert at the end of format_and_start to make sure the num_chunks across all requested services don't exceed MAX_CHUNKS_IN_SYSTEM.)

Collaborator Author

Yes, it will not cause an issue, just fail the create-vdev request.
I was thinking of tuning num_chunks based on the available chunk count, i.e. if we have 1000 chunks available and the input requested 2000, we can tune that down to 1000. This is just a minor improvement.
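A minimal sketch of that tuning idea, assuming a 64K system-wide limit (taken from this discussion) and a hypothetical `tune_num_chunks` helper:

```cpp
#include <algorithm>
#include <cstdint>

constexpr uint32_t MAX_CHUNKS_IN_SYSTEM = 64 * 1024; // 64K limit cited in this thread

// Clamp the requested num_chunks to whatever is still available system-wide,
// instead of failing the create-vdev request outright.
inline uint32_t tune_num_chunks(uint32_t requested, uint32_t already_allocated) {
    uint32_t const available = MAX_CHUNKS_IN_SYSTEM - already_allocated;
    return std::min(requested, available);
}
```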

Contributor
@yamingk yamingk Apr 16, 2024

It should not fail the create-vdev request if the consumer of HS sets the correct num_chunks for each service during hs.format_and_start, right?

Your thought is also valid: the consumer just supplies the total num_chunks it wants to create and lets HS do whatever trick is needed to figure out the available chunks.

Collaborator Author

Thinking about this case: we have a 64K chunk limit across all vdevs. This 64K is a magic number that the client does not know.
Even if the client knows, as homeobject does, several services are created with

            HomeStore::instance()->format_and_start({
                {HS_SERVICE::META, hs_format_params{.dev_type = run_on_type, .size_pct = 5.0}},
                {HS_SERVICE::LOG, hs_format_params{.dev_type = run_on_type, .size_pct = 10.0, .chunk_size = 32 * Mi}},
                {HS_SERVICE::INDEX, hs_format_params{.dev_type = run_on_type, .size_pct = 5.0}},
                {HS_SERVICE::REPLICATION,
                 hs_format_params{.dev_type = run_on_type,
                                  .size_pct = 79.0,
                                  .num_chunks = 63000,
                                  .block_size = _data_block_size,
                                  .alloc_type = blk_allocator_type_t::append,
                                  .chunk_sel_type = chunk_selector_type_t::CUSTOM}},
            });

It is very hard to know if we are exceeding 64K, as the LOG device is based on chunk_size, so its num_chunks is indirect. META/INDEX don't specify num_chunks but rely on the default value of hs_format_params.

Later, if LOG moves to dynamic size, it gets trickier, as the max size of the log device can be limited by the chunk count left over by other services; e.g. if data takes 65500 then only 34 chunks are left for the log dev.
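To illustrate why the LOG count is indirect: it falls out of size_pct and chunk_size rather than being stated by the consumer. A toy sketch with an illustrative helper name and numbers:

```cpp
#include <cstdint>

// LOG's chunk count is implied by size_pct and chunk_size; the consumer
// never writes it down, which makes the 64K system-wide total hard to track.
inline uint64_t implied_log_chunks(uint64_t total_capacity, double size_pct,
                                   uint64_t chunk_size) {
    auto const log_size = static_cast< uint64_t >(total_capacity * (size_pct / 100.0));
    return log_size / chunk_size;
}
```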

Contributor
@yamingk yamingk Apr 16, 2024

I agree that right now it is a well-known number. We can add a static API to get max_num_chunks from homestore (it could optionally return the "real" max number, already taking min_chunk_size into account).

I was thinking the consumer needs to specify num_chunks for all the requested services. In your case above, if we want to support only one service specifying num_chunks and let the other services land automatically based on vdev size, then yes, we also need to play some tricks in homestore.

Collaborator Author
@xiaoxichen xiaoxichen Apr 16, 2024

Yes, maybe let's ignore the well-known-number part for now; this is an easy fix whenever we want to do it.

I was thinking the consumer needs to specify num_chunks for all the requested services.

I feel very confused about the path of the dynamically sized vdev introduced by LOG. By dynamic sizing, it naturally cannot provide num_chunks, but we need to reserve N chunks for it to be dynamic within [0, N * CHUNK_SIZE).

}
// sanity checks
RELEASE_ASSERT(vparam.vdev_size % vparam.chunk_size == 0, "vdev_size should be multiple of chunk_size");
Contributor

Good set of asserts; these are exactly what I was about to suggest.

@@ -174,6 +175,29 @@ TEST_F(DeviceMgrTest, StripedVDevCreation) {
this->validate_striped_vdevs();
}

TEST_F(DeviceMgrTest, SmallStripedVDevCreation) {
Contributor
@yamingk yamingk Apr 15, 2024

Not suggesting it needs to be included in this PR. We can open tickets to add:

  • Mixed drive types with a very large drive size (simulated), providing only chunk_size or num_chunks.
  • Pure HDD with a very large drive size (32TB), with only chunk_size or num_chunks.
  • Pure fast drive with a very large drive size (32TB), with only chunk_size or num_chunks.
  • Create a vdev without providing num_chunks, with size_type equal to dynamic, e.g. a test starting with zero num_chunks.

Signed-off-by: Xiaoxi Chen <[email protected]>
yamingk
yamingk previously approved these changes Apr 16, 2024
Previously we had a use-after-release: during HS restart,
all chunks are released, but after restart we don't refresh our
vdev vector, which still refers to the old chunks.

Signed-off-by: Xiaoxi Chen <[email protected]>
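The use-after-release pattern this commit describes can be sketched with toy types; `Chunk`, `VDev`, and `refresh_chunks` here are illustrative, not HomeStore's real classes:

```cpp
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

struct Chunk { uint32_t id; };

struct VDev {
    std::vector< std::shared_ptr< Chunk > > chunks;

    // After a restart releases all chunks, the vdev must drop its stale
    // references and re-point at the freshly loaded chunks, otherwise any
    // access through the old vector is a use-after-release.
    void refresh_chunks(std::vector< std::shared_ptr< Chunk > > fresh) {
        chunks = std::move(fresh);
    }
};
```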
@@ -189,8 +190,9 @@ shared< VirtualDev > DeviceManager::create_vdev(vdev_parameters&& vparam) {

// Determine if we have a devices available on requested dev_tier. If so use them, else fallback to data tier
Contributor

This comment is no longer true since we explicitly specify the device type, right? Just delete it if we will not fall back to the data tier.


RELEASE_ASSERT_GT(pdevs.size(), 0, "Unable to find any pdevs for given vdev type, can't create vdev");
RELEASE_ASSERT(vparam.blk_size % pdevs[0]->align_size() == 0, "blk_size should be multiple of pdev align_size");
RELEASE_ASSERT(vparam.blk_size % 512 == 0, "blk_size should be multiple of 512bytes to be DMA aligned");
Contributor

vparam.blk_size % 512 == 0

IMO, this assert is unnecessary. We already have vparam.blk_size % pdevs[0]->align_size() == 0; any blk_size that aligns with pdevs[0]->align_size() must align with 512, since pdevs[0]->align_size() is always a multiple of 512.
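A quick sketch of the implication, with illustrative values: if align_size is a multiple of 512, then any blk_size aligned to align_size is automatically aligned to 512.

```cpp
#include <cstdint>

// If d divides x and 512 divides d, then 512 divides x; so the 512-byte
// assert is implied by the align_size assert whenever align_size % 512 == 0.
inline bool divides(uint64_t d, uint64_t x) { return x % d == 0; }
```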

// Based on the min chunk size, we calculate the max number of chunks that can be created in each target pdev
uint32_t min_chunk_size = hs_super_blk::min_chunk_size(vparam.dev_type);
// FIXME: it is possible that each vdev is less than max_num_chunks, but total is more than MAX_CHUNKS_IN_SYSTEM.
// uint32 convert is safe as it only overflow when vdev size > 64PB with 16MB min_chunk_size.
Contributor

Should we add an assert to make sure the vdev size is smaller than or equal to 64 PB?

Collaborator Author

I don't think we can have a vdev > 64PB in the near future.
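For reference, the bound the code comment cites works out as follows; the constant names are illustrative:

```cpp
#include <cstdint>

constexpr uint64_t kMinChunkSize = 16ULL << 20;                      // 16 MiB
constexpr uint64_t kOverflowVdevSize = (1ULL << 32) * kMinChunkSize; // 2^56 B = 64 PiB

// Any vdev strictly smaller than 64 PiB yields a chunk count of at most
// 2^32 - 1 with a 16 MiB min_chunk_size, so the uint32 conversion is safe.
```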

@JacksonYao287 JacksonYao287 mentioned this pull request Apr 16, 2024
Signed-off-by: Xiaoxi Chen <[email protected]>
@xiaoxichen xiaoxichen merged commit 6f93ea4 into eBay:master Apr 17, 2024
21 checks passed
Development

Successfully merging this pull request may close these issues.

cannot round up disk size
3 participants