-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FilterX copy-on-write for mutable objects #330
Open
MrAnno
wants to merge
17
commits into
axoflow:main
Choose a base branch
from
MrAnno:filterx-cow
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+1,312
−474
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
MrAnno
force-pushed
the
filterx-cow
branch
2 times, most recently
from
October 14, 2024 00:25
427be8e
to
4a0223c
Compare
Signed-off-by: László Várady <[email protected]>
Signed-off-by: László Várady <[email protected]>
Signed-off-by: László Várady <[email protected]>
Signed-off-by: László Várady <[email protected]>
Signed-off-by: László Várady <[email protected]>
Needed for introducing FilterXRef. Signed-off-by: László Várady <[email protected]>
I've written a longer PR description. The diagram may be hard to understand due to the lack of context, but we can talk about it IRL. |
Deep-copy assignment (the `.something = `, `["something"] = `, `something =`, `something +=` operators) is an essential feature of the FilterX language, making it easy to trace and understand how values change. However, copy (especially deep-copy) is expensive, so this PR aims to fix the performance bottleneck by introducing copy-on-write for mutable objects. To achieve CoW semantics without introducing endless changes in the `filterx` folder, `FilterXRef` has been introduced. References are currently not part of the FilterX language (hopefully, they never will be). `FilterXRef` is used only internally to reference the same `FilterXObject` from multiple locations (variables, other types of assignments). For this reason, `FilterXRef` pretends to be a `FilterXObject`, but it's not a real FilterX type. `FilterXRef` transparently propagates operations (virtual function calls) to the referenced value, but it also makes sure that the value is copied on the first write operation in case multiple references exist. `FilterXRef` is created only for mutable objects, and only when they are needed: around assignment-like operators; this and its `FilterXObject` behavior was the key to avoid polluting all code paths with references. Signed-off-by: László Várady <[email protected]>
Signed-off-by: László Várady <[email protected]>
The function will be extended with FilterXRef-related functionality, which requires a separate unit to avoid circular dependency between FilterX objects and refs. Signed-off-by: László Várady <[email protected]>
FilterXRef can mimic the functionality of its referenced value, but only through virtual methods. filterx_object_is_type() is usually followed by an implicit type cast, which is impossible to implement transparently (without the unwrap() call), so this safety check has been added. This is far from ideal, but doing downcasts is the original and real problem here. Signed-off-by: László Várady <[email protected]>
Signed-off-by: László Várady <[email protected]>
"lhs_object" is "self". Signed-off-by: László Várady <[email protected]>
`FilterXRef` is created only for mutable objects, and only when they are needed: around assignment-like operators; this and its `FilterXObject` behavior is the key to avoid polluting all code paths with references. Full CoW on subscript and attr operators (`.something`, `["something"]`) of JSON objects is currently not implemented. I've tried to make it work, but it requires a complete redesign of how we cache JSON subobjects (or alternatively, we need an internal representation for dicts and lists without using the json-c library). This will be implemented later. For now, such operators make unnecessary copies (CoW only partially works on them). Signed-off-by: László Várady <[email protected]>
FilterXRef can mimic the functionality of its referenced value, but only through virtual methods. Where (implicit) downcasts are used, it is impossible to be transparent about references, so these unwrap methods have to be used. This is far from ideal, but doing downcasts based on is_type() is the original and real problem here. In such cases, `filterx_ref_unwrap()` must be called to extract the original object. The extracted object must only be used locally (short-term, without referencing or storing it anywhere). Signed-off-by: László Várady <[email protected]>
Signed-off-by: László Várady <[email protected]>
Signed-off-by: László Várady <[email protected]>
Signed-off-by: László Várady <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Depends on #346
Copy-on-write and
FilterXRef
Deep-copy assignment (the
.something =
,["something"] =
,something =
,something +=
operators) is an essential feature of the FilterX language, making it easy to trace and understand how values change.However, copy (especially deep-copy) is expensive, so this PR aims to fix the performance bottleneck by introducing copy-on-write for mutable objects.
To achieve CoW semantics without introducing endless changes in the
filterx
folder,FilterXRef
has been introduced.References are currently not part of the FilterX language (hopefully, they never will be).
FilterXRef
is used only internally to reference the sameFilterXObject
from multiple locations (variables, other types of assignments).For this reason,
FilterXRef
pretends to be aFilterXObject
, but it's not a real FilterX type.FilterXRef
transparently propagates operations (virtual function calls) to the referenced value, but it also makes sure that the value is copied on the first write operation in case multiple references exist.FilterXRef
is created only for mutable objects, and only when they are needed: around assignment-like operators;this and its
FilterXObject
behavior was the key to avoid polluting all code paths with references.Implementation concerns
Unfortunately,
FilterXRef
can only be transparent when virtual calls are used.We use a lot of
downcasting
andisinstanceof()
calls for logical and performance reasons.Where explicit downcasting is needed,
filterx_ref_unwrap()
must be called to extract the original object. The extracted object must only be used locally (short-term, without referencing or storing it anywhere).This is ugly and hard to maintain, but downcasting is the original problem here. A safety check has been added to
filterx_object_is_type()
in order to detect if we forget to callunwrap()
.(And since
FilterXRef
is only for mutable types, I could avoid polluting the codebase with those calls where immutable downcasts were used.)Everything sounds simple in theory, but debugging this with GDB is a nightmare if you try to catch a bug.
Current limitations
Full CoW on subscript and attr operators (
.something
,["something"]
) of JSON objects is currently not implemented.I've tried to make it work, but it requires a complete redesign of how we cache JSON subobjects (or alternatively, we need an internal representation for dicts and lists without using the json-c library). This will be implemented later.
For now, such operators make unnecessary copies (CoW only partially works on them).