-
-
Notifications
You must be signed in to change notification settings - Fork 3.9k
Opportunistically use dense iteration for archetypal iteration #14049
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
this pr also unblocks some compiler auto SIMD ,consider following code
|
accum = | ||
// SAFETY: Matched table IDs are guaranteed to still exist. | ||
let table = unsafe { self.tables.get(archetype.table_id()).debug_checked_unwrap() }; | ||
if table.entity_count() == archetype.len() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be worthwhile to add a comment explaining that this an optimization for hybrid queries where the archetype is actually dense
Does this fully fix #2144? I suppose not, since it only optimizes for_each. |
yeah , Optimizing |
Sounds good. Let's leave that to follow-up then. |
v.push(world.spawn(TableData(0.)).id()); | ||
} | ||
|
||
v.shuffle(&mut deterministic_rand()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
v.shuffle(&mut deterministic_rand());
I think this should have a comment to explain why you're doing it.
If I understand correctly you have archetypes
A: [TableData, SparseData]
B: [TableData]
Entities are added in this order:
e0: A
e1: B
e2: A
e3: B
...
So that the order of entities is
Table: [e0, e1, e2, ...]
A: [e0, e2, ..]
B: [e1, e3, ..]
Then if you just despawned all the entities of B in order, you would remove e1, then e3, etc.
Each despawn call would do a swap_remove
in both the archetype and the table.
So the order of entities in the table would be somewhat jumbled compared with the order in A, which is still [e0, e2, ...]
By adding the shuffling, you want to guarantee that their orders are very different; which maximimes the benefits of your PR (which keeps cache-locality by iterating through the table order, instead of the archetype order
Could you please explain why this code is faster?
In this case the entities should be iterated in the same order before/after your change, no? |
it would theoretically have the same cache locality. however, dense iteration could imply to the compiler that we are iterating over continuous memory which could enable automatic SIMD optimizations , This could potentially make the operation nearly 4 times faster. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not an ECS expert, but the code quality overall looks fine.
Co-authored-by: Alice Cecile <[email protected]>
Co-authored-by: Christian Hughes <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me! Nice work.
…#14673) # Objective - follow of #14049 ,we could use it on our Parallel Iterator,this pr also unified the used function in both regular iter and parallel iterations. ## Performance  no performance regression for regular itertaion 3.5X faster in hybrid parallel iteraion,this number is far greater than the benefits obtained in regular iteration(~1.81) because mutable iterations on continuous memory can effectively reduce the cost of mataining core cache coherence
Thank you to everyone involved with the authoring or reviewing of this PR! This work is relatively important and needs release notes! Head over to bevyengine/bevy-website#1667 if you'd like to help out. |
Objective
Solution
for_each
style iteration.Performance
nearly 2x win in specific scenarios, and no performance degradation in other test cases.