Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BugFix: Agent ID's were not static during host functions. #1270

Merged
merged 5 commits into from
Jan 7, 2025

Conversation

Robadob
Copy link
Member

@Robadob Robadob commented Dec 10, 2024

Fixes two bugs found by @ptheywood (reported via Slack, no GitHub issue), and introduces 3 new tests that covers the former issues.

Bug 1

If agents were created during a host function, and then accessed via a DeviceAgentVector in the same host function (layer). The internal transfer from host agent creation storage to DeviceAgentVector would see their IDs regenerated. This bug was also present if performed across multiple init functions, as host agent creation is only performed at the end of the init "layer".

Resolution 1

DeviceAgentVector_impl::_insert() was always assigning IDs, the check for ID already being set was commented out as assumed redundant(?). So it was uncommented, which resolved the failing test case.

Bug 2

In the same edge-case as Bug 1, only the agent's ID variable would be sync'd to device.

Resolution 2

Inside DeviceAgentVector_impl::_insert(), the change notifications for all variables were only set at the very end of the method, however if unbound_buffer.empty() then it would exit before this was reached. So I moved the change notification up, also removed the specific ID change notification as this should be covered with all variables.


The full test suite still passes (Windows/Debug/SEATBELTS=ON)

[==========] 1118 tests from 86 test suites ran. (530210 ms total)
[  PASSED  ] 1118 tests.

  YOU HAVE 53 DISABLED TESTS

If agents were created via host agent creation, then accessed via DeviceAgentVector, their HostAgentCreation assigned IDs would be replaced.
@Robadob Robadob added the bug label Dec 10, 2024
@Robadob Robadob self-assigned this Dec 10, 2024
@Robadob Robadob requested a review from ptheywood December 10, 2024 21:46
@ptheywood
Copy link
Member

This addresses my original ID issue, however, I've also encountered a related issues where if agents are created and mutated in the same init function, data is then not correctly set in device/step function.

I'll add / amend a test to reproduce this later today.

Robadob and others added 4 commits December 11, 2024 21:24
Spotted it's comment was a copy-paste mistake

Also remove redundant double initialisation.
Which cover other potential use cases, by combining the 2 init functions into one, and ensuring that the device see's the updated data (both id and id_copy)
Copy link
Member

@ptheywood ptheywood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New tests (and all others) pass under linux (CUDA 12.0, Release, SEATBELTS=ON)

[==========] 1131 tests from 87 test suites ran. (176509 ms total)
[  PASSED  ] 1131 tests.

  YOU HAVE 40 DISABLED TESTS

Ta for finding the fix (and fixing the broken test I added)

@ptheywood ptheywood merged commit a31013d into master Jan 7, 2025
18 of 22 checks passed
@ptheywood ptheywood deleted the bugfix_2024_12_10 branch January 7, 2025 11:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants