variable: return native Go pointers to bpf map memory #1607

Open · ti-mo wants to merge 2 commits into main from tb/atomic-memory

Conversation

@ti-mo ti-mo (Collaborator) commented Nov 7, 2024

Pending the proposal in golang/go#70224, here's the current API proposal for returning Go pointers to 'off-heap' bpf map memory:

func MemoryPointer[T any](mm *Memory, off uint64) (*T, error)
func VariablePointer[T any](v *Variable) (*T, error)
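
For illustration, a hedged usage sketch of the proposed API (the spec loading and all names here are hypothetical; with the edit below, CollectionOptions.UnsafeVariableExperiment would also need to be enabled):

obj := struct {
	Counter *Variable `ebpf:"counter"`
}{}
if err := spec.LoadAndAssign(&obj, nil); err != nil {
	// handle error
}

// A Go pointer straight into the bpf map memory backing the variable.
ptr, err := VariablePointer[uint64](obj.Counter)
if err != nil {
	// handle error
}
*ptr = 42 // written directly to the map, no Set() marshaling round trip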

We need a bit of help from the Go runtime to manage the lifecycle of the underlying memory mapping. Currently, this mmaps over a piece of Go heap to allow the GC to track pointers into the bpf map memory, obviating the need for tying the mmap's lifetime to an ebpf.Memory. Instead, any Go pointers into the mmap will keep it alive, removing the risk of use-after-free.

Clearly, this involves some dark arts, so we're not comfortable pushing this on users since Collection.Variables is populated by default. Any change in the runtime or allocator may break this, and the fallout means Go programs segfaulting without any hint as to why, which is not acceptable.

--- Edit ---

Since the runtime.AddManualMemoryCleanup proposal may take a while to implement, land and ship in a Go version we target, I've revived the patch set and put the heap-mapping behaviour behind CollectionOptions.UnsafeVariableExperiment. MemoryPointer remains unexported for now, since Map.Memory() is typically called directly by the user and we'd need to store the feature flag in *Map, which doesn't feel ideal.

Alternatively, we could go for an env feature flag instead of CollectionOptions (EBPFUNSAFEVARIABLES=true?), which would allow us to export MemoryPointer and transparently switch between implementations.

@ti-mo ti-mo mentioned this pull request Feb 18, 2025
@ti-mo ti-mo changed the title from "[WIP] variable: return native Go pointers to underlying bpf map values" to "variable: return native Go pointers to underlying bpf map values" Feb 25, 2025
@ti-mo ti-mo changed the title from "variable: return native Go pointers to underlying bpf map values" to "variable: return native Go pointers to bpf map memory" Feb 27, 2025
@ti-mo ti-mo force-pushed the tb/atomic-memory branch 3 times, most recently from 2d43b45 to 821aa80 on February 27, 2025 15:13
@ti-mo ti-mo marked this pull request as ready for review February 27, 2025 15:48
@ti-mo ti-mo requested a review from a team as a code owner February 27, 2025 15:48
Comment on lines +264 to +269
func VariablePointer[T any](v *Variable) (*T, error) {
if err := checkVariable[T](v); err != nil {
return nil, fmt.Errorf("variable pointer %s: %w", v.name, err)
}
return memoryPointer[T](v.mm, v.offset)
}

@mejedi mejedi (Contributor) commented Mar 4, 2025:

It doesn't look like it is possible to support slices within the same API, does it?

@ti-mo (Collaborator, Author) replied:

Not in the current proposal, no. I think supporting slices is a hazard, and we should probably document that clearly. (append() will reallocate and copy, unsharing the backing array)

Of course the caller can always slice the array, but we should encourage passing around the array pointer instead.
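
A hedged sketch of that hazard, assuming an array-typed variable v:

arr, _ := VariablePointer[[4]uint32](v) // *[4]uint32 backed by bpf map memory
s := arr[:]      // slicing the array still aliases the map memory
s = append(s, 5) // reallocates on the Go heap; s silently stops aliasing the map
s[0] = 1         // updates the private copy, not the bpf map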

@lmb lmb (Collaborator) left a comment:

Can either of you refresh my mind again what the killer feature for this is? Seems like @mejedi is interested in using this for slices?

}

var t T
size := binary.Size(t)

Collaborator:

Why is this binary.Size and not unsafe.Sizeof?

Collaborator:

This also doesn't take unsafe.Alignof into account.

@ti-mo (Collaborator, Author) replied:

> Why is this binary.Size and not unsafe.Sizeof?

unsafe.Sizeof() tolerates variable-sized types like untyped ints and also returns the size of the slice descriptor if the argument is a slice. binary.Size() rejects both. I think it's safer to reject variable-sized for now to avoid potential surprises.
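
For reference, a small illustration of the difference (not code from this change; sizes as on 64-bit):

var s []uint32
fmt.Println(unsafe.Sizeof(s))    // 24: the size of the slice header, not the data
fmt.Println(binary.Size(s))      // 0 for a nil slice: only the encoded elements count
fmt.Println(binary.Size(int(0))) // -1: plain int has no fixed size and is rejected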

> This also doesn't take unsafe.Alignof into account.

Don't the next few lines take care of ensuring aligned access?

Collaborator:

> I think it's safer to reject variable-sized for now to avoid potential surprises.

Unfortunately binary.Size ignores trailing padding, which means that it is not safe to use that function here.

struct foo {
  uint64_t a;
  uint32_t b;
  // implicit 4 byte padding
}

I agree with you that these types shouldn't be allowed, but I think it might be better to do that in the caller of this function. Then we have a separation of concerns:

  • this function ensures that we only transmute memory that is "safe" per the memory model, respecting full size and alignment.
  • the caller can further restrict this to sized types, maybe ones that embed structs.HostLayout, etc.

> Don't the next few lines take care of ensuring aligned access?

The alignment of a type can be smaller than its size, so I think that the check below is incorrect in the sense that it rejects correctly aligned structs. For example:

struct foo {
  uint64_t a, b;
}

This must be aligned to 8 bytes, not 16.
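
A hedged Go rendering of both examples (sizes and alignments as on 64-bit platforms):

type foo struct { // struct { uint64_t a; uint32_t b; }
	A uint64
	B uint32
	// 4 bytes of implicit trailing padding
}
// unsafe.Sizeof(foo{})  == 16 (includes the trailing padding)
// binary.Size(foo{})    == 12 (ignores it)
// unsafe.Alignof(foo{}) == 8

type bar struct { // struct { uint64_t a, b; }
	A, B uint64
}
// unsafe.Sizeof(bar{})  == 16
// unsafe.Alignof(bar{}) == 8 (alignment can be smaller than size)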

@ti-mo (Collaborator, Author) replied:

> Unfortunately binary.Size ignores trailing padding, which means that it is not safe to use that function here.

I see. I guess that's the reasoning behind unsafeBackingMemory as well; there you're using reflect.Type.Size(), which is analogous to unsafe.Sizeof(). Then, you compare it to binary.Size() to detect padding.

Thanks for the feedback! I'll see what I can come up with. Looks like we also need better tests. 🙂
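
For illustration, a minimal sketch of that padding check (hypothetical helper, not code from this PR):

// hasPadding reports whether T's in-memory size differs from its encoded size,
// i.e. whether it contains padding. binary.Size returns -1 for types it cannot
// encode, which also trips the check. Requires Go 1.22+ for reflect.TypeFor.
func hasPadding[T any]() bool {
	var t T
	return binary.Size(t) != int(reflect.TypeFor[T]().Size())
}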

// heap allocation, even individual struct fields or arbitrary offsets within
// a slice. In this case, finalizers set on struct fields or slice offsets
// will only run when the whole struct or backing array are collected. The
// accepted runtime.AddCleanup proposal makes this behaviour more explicit and

Collaborator:

Does all of this still work when switching to AddCleanup?

@ti-mo (Collaborator, Author) replied:

Need to check, but afaik AddCleanup is SetFinalizer modulo object resurrection.
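
For reference, the shape of the accepted runtime.AddCleanup API (Go 1.24); mapping and mappingFD are hypothetical here:

c := runtime.AddCleanup(mapping, func(fd int) {
	// Runs after mapping becomes unreachable. Unlike a finalizer, the cleanup
	// only receives fd and cannot resurrect mapping itself.
	unix.Close(fd)
}, mappingFD)
_ = c // c.Stop() would cancel the cleanup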

}

var t T
size := binary.Size(t)

Collaborator:

Would it be clearer to check that unsafe.Sizeof() == binary.Size()?

@ti-mo (Collaborator, Author) replied:

I think that would make things more complex, not less. The reader would have to be aware of the intricacies and the differences between both, and we'd still need the comparison to v.size, so it wouldn't eliminate any code.

Collaborator:

I think there are lots of assumptions that aren't explored yet:

  • GC behaviour on mappings which need to be aligned and therefore don't alias the original allocation
  • What happens when the heap mapping is dropped while the map's mmap still exists
  • What happens when a user sets a finalizer on type-cast memory
  • Test that things are actually freed at the right point in time
  • Test that a pointer at an offset to the original allocation keeps the whole thing alive

@ti-mo (Collaborator, Author) replied:

> • GC behaviour on mappings which need to be aligned and therefore don't alias the original allocation
> • What happens when a user sets a finalizer on type-cast memory

https://go.dev/play/p/FKVnnU85C8b

> • Test that a pointer at an offset to the original allocation keeps the whole thing alive

Could this question be nuanced? This seems to be inherent to how GC works.

> • What happens when the heap mapping is dropped while the map's mmap still exists

Not sure how this can happen without a moving GC and while there are still references to a span.

> • Test that things are actually freed at the right point in time

Let me come up with something.

@ti-mo ti-mo force-pushed the tb/atomic-memory branch from 821aa80 to fe8e5b1 on March 13, 2025 11:36

@ti-mo ti-mo (Collaborator, Author) commented Mar 13, 2025:

> Can either of you refresh my mind again what the killer feature for this is? Seems like @mejedi is interested in using this for slices?

I think the most pressing use case is atomics, which are currently not supported at all. Array variables currently work through Get() and Set(), but there's always a marshaling cost. Also, simply having the convenience of manipulating BPF variables as if they were Go variables is nice.
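
A hedged sketch of the atomics use case (obj.Counter naming a uint64 BPF global is hypothetical):

ctr, err := VariablePointer[atomic.Uint64](obj.Counter)
if err != nil {
	// handle error
}
ctr.Add(1)              // atomic read-modify-write, visible to the running bpf program
fmt.Println(ctr.Load()) // atomic load, no Get() unmarshaling round trip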

@ti-mo ti-mo requested a review from lmb March 13, 2025 11:43

// Bump the value by 1 using a bpf program.
want := uint32(1338)
a32, err := VariablePointer[atomic.Uint32](obj.Atomic)

Collaborator:

I should've said: it also only contains unexported fields. Passing the type to binary.Read or Write will cause an error. Is that the correct choice to make here?

func unmap(size int) func(*byte) {
return func(a *byte) {
// Create another mapping at the same address to undo the original mapping.
// This will cause the kernel to repair the slab since we're using the same

Collaborator:

What would happen if the kernel didn't "repair the slab" for some reason, and kept a separate mapping instead? Nothing in the man pages suggests that this is required behaviour.

@ti-mo (Collaborator, Author) replied:

> What would happen if the kernel didn't "repair the slab" for some reason, and kept a separate mapping instead?

I just documented this as an observation; you're right that this isn't documented behaviour, it's just how Linux behaves. There would be no behavioural change if the kernel doesn't repair the slab (basically, if the kernel no longer defragments contiguous page table entries/vmas). munmap takes addr and length, so if the runtime decides to release the mspan, it would munmap the whole range it thought it owned, including the intermediate mappings created by our hack (https://pubs.opengroup.org/onlinepubs/009696799/functions/munmap.html):

> The munmap() function shall remove any mappings for those entire pages containing any part of the address space of the process starting at addr and continuing for len bytes.

// align the requested allocation to a page boundary. This is needed for
// MAP_FIXED and makes sure we don't mmap over some other allocation on the Go
// heap.
size = internal.Align(size+os.Getpagesize(), os.Getpagesize())
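
For context, a sketch of the rounding this performs (hypothetical helper, not the library code):

func align(n, alignment int) int {
	// round n up to the next multiple of alignment
	return (n + alignment - 1) / alignment * alignment
}

// With 4 KiB pages, align(5000+4096, 4096) == 12288: three pages for a
// 5000-byte variable, which is why large page sizes inflate the overhead
// discussed below.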

Collaborator:

Something we discussed on the previous PR I think: this will be expensive on platforms that have large page sizes like arm64. Not sure there is anything we can do about it, but might make sense to call it out in the docs.

@ti-mo (Collaborator, Author) replied:

Hmm, okay. Looks like 64k pages need to be explicitly selected while building the kernel, and Ubuntu still defaults to 4k pages for now. This is all moot when runtime.AddForeignCleanup lands, so I'm not convinced we should burden the API docs with this detail.
