Experimental dpctl support for native_cpu device
#2051
ndgrigorian asked this question in Ideas
The oneAPI DPC++ compiler now has experimental support for a "native CPU" device, which treats the host CPU as a "first-class citizen."
This discussion is meant both to explore the use of native_cpu devices and to provide convenient instructions on how to get started with dpctl and native_cpu targets.
OS: Ubuntu 24.04 Noble
CPU: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
Initial step
I first created a Conda environment containing the requirements for building dpctl (see documentation).
Setting up the compiler
I then cloned the oneAPI DPC++ compiler from its GitHub repo. With my local copy, I read through the documentation and (after one failed experiment) found that I could successfully build the compiler and begin to build dpctl using the following steps.
From the repo root, I ran
python buildbot/configure.py --native-cpu --llvm-external-projects="lld"
having found that without --llvm-external-projects="lld", dpctl would fail to build, citing lld as being at fault.
After configuring, I ran the build step.
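The configure and build steps could be scripted roughly as follows. This is only a sketch: the configure flags are the ones quoted in this post, while the use of buildbot/compile.py for the build step and the helper names dpcpp_build_commands and run_build are assumptions, not something stated here.

```python
import subprocess
import sys
from pathlib import Path


def dpcpp_build_commands(repo_root):
    """Return the argv lists for configuring and building DPC++ with native_cpu.

    Hypothetical helper: the configure flags come from the post; the
    compile.py step is an assumption about the usual buildbot workflow.
    """
    buildbot = Path(repo_root) / "buildbot"
    configure = [
        sys.executable,
        str(buildbot / "configure.py"),
        "--native-cpu",
        "--llvm-external-projects=lld",
    ]
    compile_ = [sys.executable, str(buildbot / "compile.py")]  # assumed build step
    return [configure, compile_]


def run_build(repo_root, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in dpcpp_build_commands(repo_root):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, cwd=repo_root, check=True)


if __name__ == "__main__":
    # Dry run by default so nothing is built by accident.
    run_build("/path/to/repo/llvm", dry_run=True)
```

Keeping the dry run as the default makes it easy to inspect the exact commands before committing to a long build.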
This took quite a while, but it did succeed, creating the built compiler in /path/to/repo/llvm/build/install, with clang and clang++ in bin. I also verified that the UR adapter for native_cpu was present in lib.
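A quick way to repeat that check is sketched below. The helper name check_install is hypothetical, and the adapter is matched with a loose glob because the exact library file name is not given in this post.

```python
from pathlib import Path


def check_install(install_dir):
    """Report which expected pieces of the compiler install are present.

    The adapter is matched with a loose glob because the exact library
    file name is an assumption, not something given in the post.
    """
    install = Path(install_dir)
    lib = install / "lib"
    return {
        "clang": (install / "bin" / "clang").exists(),
        "clang++": (install / "bin" / "clang++").exists(),
        "native_cpu adapter": lib.is_dir() and any(lib.glob("*native_cpu*")),
    }


if __name__ == "__main__":
    # Same install prefix as in the post; adjust to your checkout.
    for name, found in check_install("/path/to/repo/llvm/build/install").items():
        print(f"{name}: {'found' if found else 'missing'}")
```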
Building dpctl
With the compiler built, I set up the environment much as when building dpctl with the nightly compiler: I installed the other dependencies, then set up LD_LIBRARY_PATH and PATH, similar to nightly builds.
I then verified that sycl-ls showed the native_cpu device, and it worked!
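The environment setup and the sycl-ls check might look something like this in script form. It is a sketch under assumptions: the post does not say which directories went on PATH and LD_LIBRARY_PATH, so the bin and lib directories of the install prefix are assumed here, and the helper names compiler_env and lists_native_cpu are hypothetical.

```python
import os
import shutil
import subprocess


def compiler_env(install_dir, base_env=None):
    """Return a copy of the environment with the built compiler put first.

    Assumes the relevant directories are <install>/bin and <install>/lib.
    """
    env = dict(os.environ if base_env is None else base_env)
    bin_dir = os.path.join(install_dir, "bin")
    lib_dir = os.path.join(install_dir, "lib")
    env["PATH"] = os.pathsep.join(p for p in (bin_dir, env.get("PATH", "")) if p)
    env["LD_LIBRARY_PATH"] = os.pathsep.join(
        p for p in (lib_dir, env.get("LD_LIBRARY_PATH", "")) if p
    )
    return env


def lists_native_cpu(sycl_ls_output):
    """True if any line of sycl-ls output mentions native_cpu (case-insensitive)."""
    return any("native_cpu" in line.lower() for line in sycl_ls_output.splitlines())


if __name__ == "__main__":
    env = compiler_env("/path/to/repo/llvm/build/install")
    sycl_ls = shutil.which("sycl-ls", path=env["PATH"])
    if sycl_ls is None:
        print("sycl-ls not found on PATH")
    else:
        out = subprocess.run(
            [sycl_ls], env=env, capture_output=True, text=True, check=False
        ).stdout
        print("native_cpu device listed:", lists_native_cpu(out))
```

Building the environment as a dictionary and passing it to the subprocess leaves the calling shell untouched, which helps when switching between this build and a nightly compiler.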
Now I ran the dpctl build. This worked, to a point. The _tensor_linalg sub-module failed to build and, after it, the _tensor_sorting sub-module. I commented both of these out.
This eventually succeeded, though warnings were thrown for a significant number of math functions, like some trig functions, log1pf, etc.
After this, it was possible to import dpctl, run dpctl.lsplatform(2), and see SYCL_NATIVE_CPU as a platform.
Implementing native_cpu in dpctl
dpctl.get_devices() would ignore the native_cpu device because it hadn't been hooked up in dpctl's machinery, so I made adjustments to enable it.
And it's a success! Some kernels can even be run with it.
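For anyone trying the same thing, a check along these lines may be useful. It is a minimal sketch, not the code used in this post: the helper native_cpu_devices is hypothetical, matching on the platform name is an assumption based on the SYCL_NATIVE_CPU platform reported by dpctl.lsplatform, and whether any given dpctl.tensor operation works on native_cpu is not guaranteed.

```python
try:
    import dpctl
    import dpctl.tensor as dpt
except ImportError:  # dpctl is not installed in this environment
    dpctl = None


def native_cpu_devices(devices):
    """Keep only devices whose platform name mentions NATIVE_CPU.

    Matching on the platform name is an assumption based on the
    SYCL_NATIVE_CPU platform reported by dpctl.lsplatform.
    """
    return [d for d in devices if "NATIVE_CPU" in d.sycl_platform.name.upper()]


if __name__ == "__main__" and dpctl is not None:
    dpctl.lsplatform(2)  # verbose platform listing, as in the post
    found = native_cpu_devices(dpctl.get_devices())
    if not found:
        print("no native_cpu device visible to dpctl")
    else:
        # Try a small element-wise operation on the first native_cpu device.
        x = dpt.arange(10, dtype="i4", device=found[0])
        print(dpt.asnumpy(x + 1))
```

With a stock dpctl build the filter is expected to return an empty list, since the device is only exposed once the machinery has been adjusted.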
Public branch
To experiment with this, the branch experimental/support-native-cpu-device has been made available. It comments out the failing sub-modules and implements native_cpu in the machinery.