Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DeivceASAN] Make ShadowMemory one instance per type #2585

Merged

Conversation

yingcong-wu
Copy link
Contributor

@yingcong-wu yingcong-wu commented Jan 20, 2025

Previously we had one ShadowMemory for every DeviceInfo. That is causing problem because it's release process is called too late. And only the first ShadowMemory would do the actual shadow memory bookkeeping.

This is a quick fix but also is heading in the direction we plan, which is one ShadowMemory for each device type. This relies on the fact that for the virtual memory with L0, it can access the address from different device and within different context with newer driver. And as for opencl CPU device, since it is host memory, it does not have to worry about this issue.

We will purpose PR later to do a better refactor, but for now we are trying to fix the issue with minimum changes to avoid any possible regression.

Intel/llvm PR: intel/llvm#16687

@yingcong-wu yingcong-wu force-pushed the yc-test-main/0120-on-shadow-per-type branch from 24f239c to c2aa7cd Compare January 20, 2025 07:19
@yingcong-wu yingcong-wu marked this pull request as ready for review January 20, 2025 07:20
@yingcong-wu yingcong-wu requested a review from a team as a code owner January 20, 2025 07:20
@yingcong-wu
Copy link
Contributor Author

yingcong-wu commented Jan 20, 2025

Hi @unified-runtime-maintain, this patch is targeting v0.11, please help tag this PR.

@@ -36,8 +36,7 @@ AsanInterceptor::~AsanInterceptor() {
// We must release these objects before releasing adapters, since
// they may use the adapter in their destructor
for (const auto &[_, DeviceInfo] : m_DeviceMap) {
[[maybe_unused]] auto URes = DeviceInfo->Shadow->Destory();
assert(URes == UR_RESULT_SUCCESS);
DeviceInfo->Shadow = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clear the "m_DeviceMap" directly?

Copy link
Contributor Author

@yingcong-wu yingcong-wu Jan 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that should also work. But we aim to do minimum change here, so I would respectful reject this advice. We would have follow-up PR to do a better job at the refactoring.

@wenju-he
Copy link
Contributor

called to late.

typo: too late

@yingcong-wu
Copy link
Contributor Author

called to late.

typo: too late

Updated, thanks.

@yingcong-wu yingcong-wu requested a review from AllanZyne January 20, 2025 08:30
@yingcong-wu yingcong-wu marked this pull request as draft January 20, 2025 08:48
@pbalcer pbalcer added the v0.11.x Include in the v0.11.x release label Jan 20, 2025
@yingcong-wu
Copy link
Contributor Author

yingcong-wu commented Jan 20, 2025

Thank you @pbalcer , we will hold this PR for now to discuss an issue during our meeting tomorrow. Will mark this as ready when there is a conclusion.

@yingcong-wu yingcong-wu marked this pull request as ready for review January 21, 2025 05:16
Copy link
Contributor

@zhaomaosu zhaomaosu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@yingcong-wu
Copy link
Contributor Author

yingcong-wu commented Jan 21, 2025

Some CI tasks fail at apt install stage. Suspecting infrastructure problem.

@pbalcer
Copy link
Contributor

pbalcer commented Jan 21, 2025

Some CI tasks fail at apt install stage. Suspecting infrastructure problem.

Yeah, looks like Intel infra issue. Seems to be resolved now.

@yingcong-wu
Copy link
Contributor Author

Hi @pbalcer , CI tasks are all good now. Could you please help merge this PR?

@pbalcer pbalcer added the ready to merge Added to PR's which are ready to merge label Jan 21, 2025
@kbenzie kbenzie merged commit 262ec93 into oneapi-src:main Jan 21, 2025
75 checks passed
steffenlarsen pushed a commit to intel/llvm that referenced this pull request Jan 22, 2025
kbenzie added a commit to kbenzie/unified-runtime that referenced this pull request Jan 22, 2025
…-on-shadow-per-type

[DeivceASAN] Make ShadowMemory one instance per type
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
loader Loader related feature/bug ready to merge Added to PR's which are ready to merge v0.11.x Include in the v0.11.x release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants