variable: return native Go pointers to bpf map memory #1607
Conversation
func VariablePointer[T any](v *Variable) (*T, error) {
	if err := checkVariable[T](v); err != nil {
		return nil, fmt.Errorf("variable pointer %s: %w", v.name, err)
	}
	return memoryPointer[T](v.mm, v.offset)
}
It doesn't look like it is possible to support slices within the same API, does it?
Not in the current proposal, no. I think supporting slices is a hazard, and we should probably document that clearly: append() will reallocate and copy, unsharing the backing array.
Of course the caller can always slice the array, but we should encourage passing around the array pointer instead.
Can either of you refresh my mind again what the killer feature for this is? Seems like @mejedi is interested in using this for slices?
var t T
size := binary.Size(t)
Why is this binary.Size and not unsafe.Sizeof?
This also doesn't take unsafe.Alignof into account.
> Why is this binary.Size and not unsafe.Sizeof?

unsafe.Sizeof() tolerates variable-sized types like untyped ints and also returns the size of the slice descriptor if the argument is a slice. binary.Size() rejects both. I think it's safer to reject variable-sized types for now to avoid potential surprises.

> This also doesn't take unsafe.Alignof into account.

Don't the next few lines take care of ensuring aligned access?
> I think it's safer to reject variable-sized for now to avoid potential surprises.

Unfortunately binary.Size ignores trailing padding, which means that it is not safe to use that function here.

struct foo {
    uint64_t a;
    uint32_t b;
    // implicit 4 byte padding
};

I agree with you that these types shouldn't be allowed, but I think it might be better to do that in the caller of this function. Then we have a separation of concerns:
- this function ensures that we only transmute memory that is "safe" per the memory model, respecting full size and alignment.
- the caller can further restrict this to sized types, maybe ones that embed structs.HostLayout, etc.
> Don't the next few lines take care of ensuring aligned access?

The alignment of a type can be smaller than its size, so I think that the check below is incorrect in the sense that it rejects correctly aligned structs. For example:

struct foo {
    uint64_t a, b;
};

This must be aligned to 8 bytes, not 16.
> Unfortunately binary.Size ignores trailing padding, which means that it is not safe to use that function here.

I see. I guess that's the reasoning behind unsafeBackingMemory as well; there you're using reflect.Type.Size(), which is analogous to unsafe.Sizeof(). Then, you compare it to binary.Size() to detect padding.

Thanks for the feedback! I'll see what I can come up with. Looks like we also need better tests. 🙂
// heap allocation, even individual struct fields or arbitrary offsets within
// a slice. In this case, finalizers set on struct fields or slice offsets
// will only run when the whole struct or backing array are collected. The
// accepted runtime.AddCleanup proposal makes this behaviour more explicit and
Does all of this still work when switching to AddCleanup?
Need to check, but AFAIK AddCleanup is SetFinalizer modulo object resurrection.
var t T
size := binary.Size(t)
Would it be clearer to check that unsafe.Sizeof() == binary.Size()?
I think that would make things more complex, not less. The reader would have to be aware of the intricacies and the differences between both, and we'd still need the comparison to v.size, so it wouldn't eliminate any code.
memory_unsafe_test.go
I think there are lots of assumptions that aren't explored yet:
- GC behaviour on mappings which need to be aligned and therefore don't alias the original allocation
- What happens when the heap mapping is dropped while the mapmap still exists
- What happens when a user sets a finalizer on type-cast memory
- Test that things are actually freed at the right point in time
- Test that a pointer at an offset to the original allocation keeps the whole thing alive
> - GC behaviour on mappings which need to be aligned and therefore don't alias the original allocation
> - What happens when a user sets a finalizer on type-cast memory

https://go.dev/play/p/FKVnnU85C8b

> - Test that a pointer at an offset to the original allocation keeps the whole thing alive

Could this question be nuanced? This seems to be inherent to how GC works.

> - What happens when the heap mapping is dropped while the mapmap still exists

Not sure how this can happen without a moving GC and while there are still references to a span.

> - Test that things are actually freed at the right point in time

Let me come up with something.
Signed-off-by: Timo Beckers <[email protected]>
I think the most pressing use case is atomics, which are currently not supported at all. Array variables currently work through Get() and Set(), but there's always a marshaling cost. Also, simply having the convenience of manipulating bpf variables like they are Go variables is nice.
// Bump the value by 1 using a bpf program.
want := uint32(1338)
a32, err := VariablePointer[atomic.Uint32](obj.Atomic)
I should've said: it also only contains unexported fields. Passing the type to binary.Read or Write will cause an error. Is that the correct choice to make here?
func unmap(size int) func(*byte) {
	return func(a *byte) {
		// Create another mapping at the same address to undo the original mapping.
		// This will cause the kernel to repair the slab since we're using the same
What would happen if the kernel didn't "repair the slab" for some reason, and kept a separate mapping instead? Nothing in the man pages suggests that this is required behaviour.
> What would happen if the kernel didn't "repair the slab" for some reason, and kept a separate mapping instead?

I just documented this as an observation; you're right that this isn't documented behaviour, it's just how Linux behaves. There would be no behavioural change if the kernel didn't repair the slab (basically, if it no longer defragmented contiguous page table entries/vmas). munmap takes addr and length, so if the runtime decides to release the mspan, it would munmap the whole range it thought it owned, including the intermediate mappings created by our hack (https://pubs.opengroup.org/onlinepubs/009696799/functions/munmap.html):

> The munmap() function shall remove any mappings for those entire pages containing any part of the address space of the process starting at addr and continuing for len bytes.
// align the requested allocation to a page boundary. This is needed for
// MAP_FIXED and makes sure we don't mmap over some other allocation on the Go
// heap.
size = internal.Align(size+os.Getpagesize(), os.Getpagesize())
Something we discussed on the previous PR I think: this will be expensive on platforms that have large page sizes like arm64. Not sure there is anything we can do about it, but might make sense to call it out in the docs.
Hmm, okay. Looks like 64k pages need to be explicitly selected while building the kernel, and Ubuntu still defaults to 4k pages for now. This is all moot when runtime.AddForeignCleanup lands, so I'm not convinced we should burden the API docs with this detail.
Pending the proposal in golang/go#70224, here's the current API proposal for returning Go pointers to 'off-heap' bpf map memory:

We need a bit of help from the Go runtime to manage the lifecycle of the underlying memory mapping. Currently, this mmaps over a piece of Go heap to allow the GC to track pointers into the bpf map memory, obviating the need for tying the mmap's lifetime to an ebpf.Memory. Instead, any Go pointers into the mmap will keep it alive, removing the risk of use-after-free. Clearly, this involves some dark arts, so we're not comfortable pushing this on users, since Collection.Variables is populated by default. Any change in the runtime or allocator may break this, and the fallout means Go programs segfaulting without any hint as to why, which is not acceptable.

--- Edit ---

Since the runtime.AddManualMemoryCleanup proposal may take a while to implement, land and ship in a Go version we target, I've revived the patch set and put the heap-mapping behaviour behind CollectionOptions.UnsafeVariableExperiment. MemoryPointer is still unexported, since Map.Memory() is typically called directly by the user and we'd need to store the feature flag in *Map, which doesn't feel ideal.

Alternatively, we could go for an env feature flag instead of CollectionOptions (EBPFUNSAFEVARIABLES=true?), which would allow us to export MemoryPointer and transparently switch between implementations.