-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
xpu/ocl: Minor generic-vendor fixes #2084
Open
nwnk
wants to merge
4
commits into
oneapi-src:main
Choose a base branch
from
nwnk:xpu-ocl-generic
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from 1 commit
Commits
Show all changes
4 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that getting the intel platform checks out of
xpu::ocl
makes sense, but we still need some filtering to prevent non-intel platforms from dispatching into the code insrc/gpu/intel
. Because of this, it seems like we need to add a check for the intel platform before the call to gpu::intel::ocl::engine_create.In addition, I don't think we can enable the non-intel runtimes to dispatch into this code either as it is not currently supported and there is no testing infrastructure to ensure it works. We will need an RFC on how multiple runtimes should be supported before that can be considered. I get that this functionality is useful for your exploration of oneDNN though, maybe we can add a development variable guard in
gpu::intel::ocl::engine_create
(similar to here, and documented here) for now?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_devices
returns only intel devices and therefore it will affect all Intel-specific GPU code that relies on that.get_devices
to filter out devices and platforms as it has to be done somewhere anyway. Theis_intel_platform
function merely checks the name of the platform so it doesn't add any dependency on Intel-specific API.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rjoursler:
I like that. Would it make sense to then remove the consistency check in
check_device
? That's currently the thing preventing us from binding to rusticl on otherwise-Intel hardware, so it needs to be kept somewhere, but I think if we assertis_intel_platform
beforeengine_create
we wouldn't need to verify it later.As of 7cfc834 (a few days old, just happens to be checked out), building the OCL engine for the GENERIC vendor does not produce a working
libdnnl.so
:Even if it would link, the engine factory would only call the rest of the engine ctor if we were built for INTEL, and GENERIC will just throw a runtime error. All of which to say: I agree it ought to be tested once it's enabled, but that configuration is still pretty unreachable.
@densamoilov:
Meh? I don't think that's an especially good contract, certainly not in code that doesn't have
intel
in the pathname, and (per the above) it can't matter yet as the GENERIC configuration does not reachclCreateContext
. In the supported INTEL configuration we will still only initialize oncl_device
s from the Intel platform, so the only visible change is that the device array would have more things in it. But (afaict) at runtime we only really operate in terms of engines not devices. So it's hard for me to see that there is any reliance on that invariant.If the list of devices needs to be filtered then the place to do it is in the place that wants the filtered view. If the GENERIC configuration can be made to work, I could easily imagine something like rusticl for local work generation plus pocld to distribute long-lived work units. That means I will want a list of devices from more than one OpenCL platform, from all of them in fact. And unfortunately the "device index" needs to be unique across the engine kind which means if I want to talk to two OpenCL platforms then I need to assign globally unique indexes to their devices anyway. I may as well just collect them all here, and (per the above) apply the platform filter before finishing device init.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point is, the contract exists and therefore if you want to change it you will have to adjust the code that relies on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For an example, consider this location.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That check looks like it doesn't belongs in
xpu::ocl
anyway. It should likely be moved to intel::ocl::ocl_gpu_engine_t::init() and use VCONDCHECK to log the creation failure. With this design, there is no need to guard the call toengine_create
either, as creation failure will be enforced by theintel::ocl_gpu_engine_t
structure.