
openhcl: refactor shared pool to general purpose page pool #260


Merged
3 commits merged into microsoft:main on Dec 2, 2024

Conversation

@chris-oo (Member) commented Nov 7, 2024

Refactor the shared visibility pool to a more general purpose page pool. This is in preparation for additional changes to support save restore, and private allocations.

Add a new `new_private_pool` method to support future private memory page pools.

Add additional tracking of allocations with device_ids and device names, which will be used for save restore. This also helps when inspecting current allocations. Update `NvmeManager` to hand out per-device allocators, which makes it easier to attribute allocations to each device.

In the future, it would be good to also update the VfioDmaBuffer trait to support additional tagging, as today all allocations are just tagged as "vfio dma", which isn't very helpful when inspecting.
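
To make the described shape concrete, here is a minimal, self-contained sketch of the idea: a pool that hands out per-device allocators and tags every allocation with the owning device's name so it can be inspected and later saved/restored. All names here (PagePool, PagePoolAllocator, alloc) are illustrative and do not match the real crate's API, and the real pool tracks and reuses freed ranges rather than bump-allocating.

    // Illustrative sketch only: approximates the behavior described above, not
    // the actual pool implementation in this PR.
    use std::sync::{Arc, Mutex};

    #[derive(Debug)]
    struct Allocation {
        base_pfn: u64,
        page_count: u64,
        device_name: String, // tag used for inspect and save/restore
    }

    struct PoolInner {
        next_free_pfn: u64,
        end_pfn: u64,
        allocations: Vec<Allocation>,
        device_names: Vec<String>,
    }

    #[derive(Clone)]
    struct PagePool {
        inner: Arc<Mutex<PoolInner>>,
    }

    struct PagePoolAllocator {
        pool: PagePool,
        device_name: String,
    }

    impl PagePool {
        /// Build a pool over `[base_pfn, base_pfn + page_count)`. The real pool
        /// is built from shared-visibility or (with new_private_pool) private
        /// memory ranges.
        fn new(base_pfn: u64, page_count: u64) -> Self {
            PagePool {
                inner: Arc::new(Mutex::new(PoolInner {
                    next_free_pfn: base_pfn,
                    end_pfn: base_pfn + page_count,
                    allocations: Vec::new(),
                    device_names: Vec::new(),
                })),
            }
        }

        /// Hand out an allocator tied to `device_name`, refusing duplicates.
        fn allocator(&self, device_name: String) -> Result<PagePoolAllocator, String> {
            let mut inner = self.inner.lock().unwrap();
            if inner.device_names.contains(&device_name) {
                return Err(format!("device name {device_name} already in use"));
            }
            inner.device_names.push(device_name.clone());
            Ok(PagePoolAllocator {
                pool: self.clone(),
                device_name,
            })
        }
    }

    impl PagePoolAllocator {
        /// Allocate `page_count` contiguous pages, tagged with this device's
        /// name. (Bump allocation only; the real pool reuses freed ranges.)
        fn alloc(&self, page_count: u64) -> Option<u64> {
            let mut inner = self.pool.inner.lock().unwrap();
            if inner.next_free_pfn + page_count > inner.end_pfn {
                return None;
            }
            let base_pfn = inner.next_free_pfn;
            inner.next_free_pfn += page_count;
            inner.allocations.push(Allocation {
                base_pfn,
                page_count,
                device_name: self.device_name.clone(),
            });
            Some(base_pfn)
        }
    }

    fn main() {
        let pool = PagePool::new(0x1000, 64);
        let nvme = pool.allocator("nvme:0".to_string()).unwrap();
        let base = nvme.alloc(4).unwrap();
        println!("nvme:0 got 4 pages starting at pfn {base:#x}");
        println!("{:?}", pool.inner.lock().unwrap().allocations);
    }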

@chris-oo chris-oo requested review from a team as code owners November 7, 2024 18:11
@chris-oo chris-oo marked this pull request as ready for review November 15, 2024 19:09
.as_ref()
.map(|p| p.allocator_spawner());

let vfio_dma_buffer_spawner =
Member

I think we'll ultimately want a trait for this so it's clearer what we're passing around. But this is fine for now.

Member Author

Agreed. I think this whole thing will get another refactor with the central DMA management code coming soon, so I don't want to over-engineer it now...
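
For context, the value being passed around is the boxed closure that appears later in the diff, Box<dyn Fn(String) -> anyhow::Result<Arc<dyn VfioDmaBuffer>> + Send>. A hypothetical trait along the lines suggested above (not something this PR adds, and the name is made up) might look like:

    // Hypothetical only, not added by this PR: name the spawner shape so call
    // sites aren't passing a bare boxed closure around. VfioDmaBuffer is the
    // existing project trait; anyhow is the error crate already in use.
    use std::sync::Arc;

    trait DmaBufferSpawner: Send {
        /// Create a VfioDmaBuffer allocator whose allocations are tagged with
        /// `device_name`.
        fn spawn_dma_buffer(&self, device_name: String) -> anyhow::Result<Arc<dyn VfioDmaBuffer>>;
    }

    // Blanket impl so existing closures keep working unchanged.
    impl<F> DmaBufferSpawner for F
    where
        F: Fn(String) -> anyhow::Result<Arc<dyn VfioDmaBuffer>> + Send,
    {
        fn spawn_dma_buffer(&self, device_name: String) -> anyhow::Result<Arc<dyn VfioDmaBuffer>> {
            self(device_name)
        }
    }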

device_ids: Vec<String>,
}

// Manually implement inspect so device_ids can be rendered as strings, not
Member

Another way to have done this might have been to make a StateView<'a> type that mirrors State but has a resolved &str device ID, then derive inspect on that. And then in the inspect impl, just .map State to StateView<'_>.

Member Author

Interesting. Do we do this pattern anywhere else? I hate losing #[derive(inspect)] because it's so nice.
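
For readers following along, here is a minimal sketch of the StateView pattern being described, with hypothetical field names and the per-allocation state simplified to a struct (the real State is an enum); the pool's Inspect implementation would then map each State to a StateView before inspecting it.

    // Hypothetical sketch of the StateView pattern; not the PR's actual types.
    use inspect::Inspect;

    struct State {
        base_pfn: u64,
        size_pages: u64,
        device_id: usize, // index into the pool's device_ids table
    }

    // Mirror of `State` with the numeric id resolved to the device's name, so
    // the derive can still be used and the id renders as a string.
    #[derive(Inspect)]
    struct StateView<'a> {
        base_pfn: u64,
        size_pages: u64,
        device_id: &'a str,
    }

    impl State {
        fn view<'a>(&self, device_ids: &'a [String]) -> StateView<'a> {
            StateView {
                base_pfn: self.base_pfn,
                size_pages: self.size_pages,
                device_id: device_ids[self.device_id].as_str(),
            }
        }
    }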

})
.expect("must find allocation");

inner.state[index] = State::Free {
Member

So we still don't do any merging on free.

I guess that's OK in practice since we just allocate during startup and don't really free.
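
For illustration only, since the pool deliberately does not do this today: coalescing on free, over an assumed free list of (base_pfn, page_count) pairs kept sorted by base_pfn, would look roughly like this.

    // Sketch only: not the pool's current behavior. Assumes a free list kept
    // sorted by base_pfn with no overlapping ranges.
    fn free_and_merge(free_list: &mut Vec<(u64, u64)>, base_pfn: u64, page_count: u64) {
        // Insert the freed range, keeping the list sorted.
        let idx = free_list.partition_point(|&(base, _)| base < base_pfn);
        free_list.insert(idx, (base_pfn, page_count));

        // Coalesce with the following range if they are adjacent.
        if idx + 1 < free_list.len() && free_list[idx].0 + free_list[idx].1 == free_list[idx + 1].0 {
            free_list[idx].1 += free_list[idx + 1].1;
            free_list.remove(idx + 1);
        }
        // Coalesce with the preceding range if they are adjacent.
        if idx > 0 && free_list[idx - 1].0 + free_list[idx - 1].1 == free_list[idx].0 {
            free_list[idx - 1].1 += free_list[idx].1;
            free_list.remove(idx);
        }
    }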

Member

On the flip side, I wonder if we should stop handing out contiguous allocations and just always hand out a list of pages. Do any callers rely on contiguous pages?

(Don't need to change anything for this PR, but I'd like to understand the requirements.)

Member Author

IIRC all the usage by devices expects contiguous pages, for queues and bounce buffers (unless I'm wrong and they're all just 4k each?). A lot of other callers just need a single 4K page, but I'd have to double-check them all.

And yeah, for free we don't really expect any runtime allocations (the GET is the only one that might free but... we should probably fix that).

Contributor

NVMe queues are 4k each and this probably will never change. NVMe bounce buffers are bigger - 512k as of today - and should be reduced once we investigate that.

"Do any callers rely on contiguous pages?"

Yes, as discussed in NVMe save/restore, the assumption will be that pages are contiguous.

anyhow::bail!("device name {device_name} already in use");
}

inner.device_ids.push(device_name.clone());
Member

We never free this.

Member Author

I don't understand this comment - the clone is required because we also store the string version of the id in the allocator itself.

Member Author

I've actually removed this in the save-restore PR coming next. It's useless as it turns out. I can remove it here if you want, or it's already gone in my tip of tree.

Member

No, you're pushing onto the device_ids array but you never free.

Member Author (@chris-oo, Nov 26, 2024)

Ah, I see. If you drop the allocator, we should remove you from the array. But the policy is a bit weird - do we allow you to get your previous ID if you create an allocator with the same state? You may have pending allocations that are tracked by a different handle.

My proposal is this:

  1. The device_id is always tied to this device, but if you free the allocator, you can create it again. We'll track Used/Unassigned state internally within the allocator. This is because, with the scheme we have (storing indices inside the allocations themselves), I don't think there is any sane/safe way to remove the device_id without walking every allocation.
  2. Outstanding allocations are allowed to stay live, and are still tied to the allocator, which may be marked as Unassigned. We only use this for device_id lookups for inspect anyway. If you create a new allocator with the same device_name, we link it to the existing id, as you should be the same device (see the sketch below).
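
A rough sketch of the bookkeeping that proposal implies (hypothetical names and types, not the code in this PR):

    // Hypothetical sketch of the proposal above, not the code in this PR.
    // Dropping an allocator would flip its slot to Unassigned rather than
    // removing it, so indices stored inside live allocations stay valid.
    enum DeviceId {
        Used(String),
        Unassigned(String),
    }

    fn attach_device(device_ids: &mut Vec<DeviceId>, name: &str) -> Result<usize, String> {
        // An allocator with this name is still live: reject the duplicate.
        if device_ids
            .iter()
            .any(|id| matches!(id, DeviceId::Used(n) if n == name))
        {
            return Err(format!("device name {name} already in use"));
        }
        // The same device was seen before and its allocator was dropped:
        // re-link it to its previous id.
        if let Some(idx) = device_ids
            .iter()
            .position(|id| matches!(id, DeviceId::Unassigned(n) if n == name))
        {
            device_ids[idx] = DeviceId::Used(name.to_string());
            return Ok(idx);
        }
        // First time this device name has been seen.
        device_ids.push(DeviceId::Used(name.to_string()));
        Ok(device_ids.len() - 1)
    }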

@jstarks (Member) left a comment

A few nits

Commits pushed to this pull request:
- renames
- rename crate
- use device id for uniqueness
- save
- spawner
- tag nvme allocations per device
- new lock
- thiserror lock
- actually fix thiserror
- undo cargo.toml
- implement friendly inspect
- fix macos
- actually fix macos
- boxed spawner

@@ -82,15 +84,15 @@ impl NvmeManager
     pub fn new(
         driver_source: &VmTaskDriverSource,
         vp_count: u32,
-        dma_buffer: Arc<dyn VfioDmaBuffer>,
+        dma_buffer_spawner: Box<dyn Fn(String) -> anyhow::Result<Arc<dyn VfioDmaBuffer>> + Send>,
Contributor

Can we define a type for this?

Member Author

We can, yeah. Let's take it as a follow-up.
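
As a possible follow-up shape (hypothetical, not added by this PR), even a plain type alias for the closure in the signature above would keep call sites readable:

    // Hypothetical follow-up: give the boxed closure a name. This matches the
    // shape of the dma_buffer_spawner parameter in the diff above; VfioDmaBuffer
    // is the existing project trait and Arc is std::sync::Arc.
    pub type DmaBufferSpawner =
        Box<dyn Fn(String) -> anyhow::Result<Arc<dyn VfioDmaBuffer>> + Send>;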

@chris-oo chris-oo merged commit ea5de5e into microsoft:main Dec 2, 2024
24 checks passed
@chris-oo chris-oo added the backport_2411 Change should be backported to the release/2411 branch label Dec 2, 2024
chris-oo added a commit to chris-oo/openvmm that referenced this pull request Dec 12, 2024
…#260)

Refactor the shared visibility pool to a more general purpose page pool.
This is in preparation for additional changes to support save restore,
and private allocations.

Add a new `new_private_pool` method to support future private memory
page pools.

Add additional tracking of allocations with device_ids and device names,
which will be used for save restore. This can also help track current
allocations. Update `NvmeManager` to hand out per-device allocators,
which better helps allocations per device.

In the future, it would be good to additionally update the VfioDmaBuffer
trait to have additional tagging, as today all allocations are just
tagged as "vfio dma" which isn't very helpful when inspecting.
chris-oo added a commit that referenced this pull request Dec 13, 2024

Backport of #260
@jstarks (Member) commented Feb 6, 2025

Backported in #474

@jstarks jstarks added backported_2411 PR that has been backported to release/2411 and removed backport_2411 Change should be backported to the release/2411 branch labels Feb 6, 2025