Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow arena.Arena to compress arbitrary Go pointers, if it owns them #359

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

mcy
Copy link
Member

@mcy mcy commented Oct 28, 2024

This PR adds Arena[T].Compress, which will take *T and compress it into an arena.Pointer[T] if the arena owns that pointer.

This in turn allows us to shrink the size of every AST node by a whole machine word, by dropping redundant arena pointers in typed exported nodes like ast.Syntax. We also don't bother to store the kind of the node either, since it can be recomputed from its type.

This change also renames Arena.New to Arena.NewCompressed, and now Arena.New returns a *T.

@mcy mcy requested a review from jhump October 28, 2024 17:06
Copy link
Member

@jhump jhump left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, but I'm still a little worried about the code that actually does the pointer comparisons.

Comment on lines 27 to 29
// KeepAlive escapes its argument, so this ensures that a and b have
// escaped to the heap and won't be moved.
runtime.KeepAlive([2]unsafe.Pointer{a, b})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Go docs say to only use runtime.KeepAlive to prevent a finalizer from running. So this feels like it could be brittle and thus dangerous.

Also, this construction -- passing an array of pointers instead of a pointer -- is very counter-intuitive. I guess this relies on the fact that pointers in the heap can never point to address on the stack? It feels safe to do this earlier with the original pointers instead of with the unsafe.Pointer aliases, something like so:

runtime.KeepAlive(p)
runtime.KeepAlive(&s[0])

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh the reason for doing it this way is to only have to pay for one function call. KeepAlive cannot be inlined. Sorry, I am very used to working around Go not having tuples by using arrays. XD

Also, &s[0] is incorrect, because it will trigger a redundant bounds check. You still want unsafe.SliceHeader.

Separately, reproduced from Slack:

it's not the right way to do it. there are a few alternatives. one is to open-code KeepAlive, because the current implementation will escape the pointers. we could instead mark the function as go:nosplit. this is used within the runtime in many places as "do not move the pointers I just received as arguments", sufficiently that people have copied the pattern and changing it would almost certainly break the world. note that even if we don't annotate it, the current version of Go won't fuck us: stack growth only happens on function entry, and that is the only situation under which go will move pointers
also, i struggle to think of a way they could move the pointers out from under us after casting to uintptr, and somehow discriminating from the case where we have an unsafe.Pointer. and also, this requires the move to happen after we've cast one of the pointers but not the others, and the result is that we report that the arena does not contain a pointer it actually contains
like with my native compiler person hat on i think the risk is nonexistent
i'm happy to write an essay about this in the comments

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm trying to think if we even need anything. Maybe //go:nosplit out of an abundance of caution and omit the runtime.KeepAlive? The only danger is when a pointer is in a uintptr var, in which case it won't be updated if it points to a stack address that gets moved. So the only risk is if stack motion occurred in the middle of the calculation on line 31.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure let's do that.

return -1
}

return int(diff / size)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we might want to also check that diff % size == 0?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any risk that size might not include trailing padding bytes? In other words, is it possible that diff/size could return the wrong index and it should instead be something like diff/(size+padding)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sizeof is the stride size, so it has to contain padding. https://go.dev/play/p/vYH_8QPvMTW

Consider T = [2]byte. If p - s[0] lies outside of the range of s, we exit early. Otherwise we perform this division, so p is between s[0] and s[len(s)-1], inclusive (it is not one past, we test for that). For diff to be divisible by size, p must point to some s[n], and not, say, to &s[n] + 1. It cannot point to &s[n] + 1 because that would be a *[2]byte that does not point to an allocated [2]byte value (instead, it stradles two of them), which is not something Go permits. Thus, we don't need the % check.

Comment on lines 27 to 29
// KeepAlive escapes its argument, so this ensures that a and b have
// escaped to the heap and won't be moved.
runtime.KeepAlive([2]unsafe.Pointer{a, b})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm trying to think if we even need anything. Maybe //go:nosplit out of an abundance of caution and omit the runtime.KeepAlive? The only danger is when a pointer is in a uintptr var, in which case it won't be updated if it points to a stack address that gets moved. So the only risk is if stack motion occurred in the middle of the calculation on line 31.

@mcy mcy requested a review from jhump November 11, 2024 18:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants