Updated Doc for Intel XPU Profile #3013
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/3013
Note: Links to docs will display an error until the docs builds have been completed. ❗ 1 Active SEV: there is 1 currently active SEV; if your PR is affected, please view it below. ✅ No Failures as of commit a5408b1 with merge base b725032. This comment was automatically generated by Dr. CI and updates every 15 minutes.
Hi @louie-tsai! Thank you for your pull request and welcome to our community.

Action Required: In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process: In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged accordingly.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!
Force-pushed from 5692f57 to 9eb8734 (Compare).
Changes to profile_with_itt look fine to me, but the changes to profiler_recipe make it really hard to read. Perhaps one should try to find an easier-to-use interface for collecting/analyzing profiler information from accelerators?
Force-pushed from 49771be to d2a7183 (Compare).
@malfet
Sorry, I was not talking about this one, but rather about the following code:

    if device == 'cuda':
        activities.append(ProfilerActivity.CUDA)
    elif device == 'xpu':
        activities.append(ProfilerActivity.XPU)

As one cannot compile PyTorch with CUDA and XPU at the same time, why introduce a new enum rather than creating an alias (i.e. …)?
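The alias idea above could be sketched as follows. This is a hypothetical illustration only: the `ProfilerActivity` enum here is a stand-in for `torch.profiler.ProfilerActivity`, and `activities_for` is an invented helper name, not part of any PyTorch API.

```python
from enum import Enum


class ProfilerActivity(Enum):
    # Stand-in for torch.profiler.ProfilerActivity, for illustration only.
    CPU = "cpu"
    CUDA = "cuda"
    XPU = "xpu"


def activities_for(device: str) -> list:
    """Map a device string to profiler activities without per-device branches."""
    activities = [ProfilerActivity.CPU]
    if device in ("cuda", "xpu"):
        # Look the member up by name instead of branching per backend.
        activities.append(ProfilerActivity[device.upper()])
    return activities
```

With a single lookup like this (or an alias such as a generic "GPU" activity resolving to whichever backend the build supports), the tutorial code would not need parallel `if device == ...` branches.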
Hi @malfet, this doesn't seem to be something that can be fixed in this tutorial; it involves code changes. We will see how to address this concern in 2.6 with a separate PR that changes the profiler code.
@malfet: why? This should be technically possible if someone installs both the CUDA and XPU stacks on the same system. And it might be convenient for some people (mostly for debugging, I guess) to build PyTorch with both paths enabled in a single build.
@malfet
Force-pushed from d2a7183 to c382e5d (Compare).
######################################################################
# (Note: the first use of XPU profiling may incur extra overhead.)

######################################################################
# The resulting table output (omitting some columns):
#
# .. code-block:: sh
#
#    -------------------------------------------------------  ------------  ------------  ------------  ------------  ------------
#                                                       Name      Self XPU    Self XPU %     XPU total  XPU time avg    # of Calls
#    -------------------------------------------------------  ------------  ------------  ------------  ------------  ------------
#                                            model_inference       0.000us         0.00%       2.567ms       2.567ms             1
#                                               aten::conv2d       0.000us         0.00%       1.871ms      93.560us            20
#                                          aten::convolution       0.000us         0.00%       1.871ms      93.560us            20
#                                         aten::_convolution       0.000us         0.00%       1.871ms      93.560us            20
#                             aten::convolution_overrideable       1.871ms        72.89%       1.871ms      93.560us            20
#                                                   gen_conv       1.484ms        57.82%       1.484ms      74.216us            20
#                                           aten::batch_norm       0.000us         0.00%     432.640us      21.632us            20
#                               aten::_batch_norm_impl_index       0.000us         0.00%     432.640us      21.632us            20
#                                    aten::native_batch_norm     432.640us        16.85%     432.640us      21.632us            20
#                                               conv_reorder     386.880us        15.07%     386.880us       6.448us            60
#    -------------------------------------------------------  ------------  ------------  ------------  ------------  ------------
# Self CPU time total: 712.486ms
# Self XPU time total: 2.567ms
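For reference, a table of this shape is what `prof.key_averages().table()` prints. A minimal CPU-only analogue (assuming a recent PyTorch install; on an XPU machine one would add `ProfilerActivity.XPU` and sort by the corresponding XPU time key) might look like:

```python
import torch
from torch.profiler import profile, ProfilerActivity

# A tiny model so the profile has a few aten:: entries to aggregate.
model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU())
inputs = torch.randn(1, 8)

with profile(activities=[ProfilerActivity.CPU]) as prof:
    with torch.no_grad():
        model(inputs)

# Aggregate per-operator stats into the kind of table shown above.
print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=10))
```

The printed table has the same columns as the XPU example above (Name, Self %, total, avg, # of Calls), just with CPU timings instead of XPU ones.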
What value does this extra table bring to the user?
There are indeed just minor changes: some different operators, and XPU instead of GPU in this table. I just want people to understand how the output might look.
Force-pushed from c382e5d to b0fcfdd (Compare).
Force-pushed from 7fe0506 to a66e75f (Compare).
@malfet
To fix spelling, please add these to the en-wordlist.txt:
Force-pushed from 2f062bf to c3df122 (Compare).
Update profiler_recipe.py to unify the accelerators python codes
Force-pushed from c3df122 to ef7d0d3 (Compare).
Addressed them accordingly. Please also help to approve it if all looks good to you.
Feels a bit verbose to me, but looks fine.
Description
PyTorch Profiling changes for XPU.
Landing pages:
https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html
https://pytorch.org/tutorials/recipes/profile_with_itt.html
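The updated recipes select the profiling device at runtime rather than hard-coding one backend. A minimal sketch of that selection (assuming a PyTorch build recent enough to expose the `torch.xpu` module; older builds may lack the attribute, hence the `hasattr` guard):

```python
import torch

# Prefer an available GPU backend for profiling, otherwise fall back to CPU.
if torch.cuda.is_available():
    device = "cuda"
elif hasattr(torch, "xpu") and torch.xpu.is_available():
    device = "xpu"
else:
    device = "cpu"

print(f"profiling on: {device}")
```

The chosen string can then drive both the model placement (`model.to(device)`) and which `ProfilerActivity` entries get passed to the profiler.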
Checklist
cc @gujinghui @EikanWang @fengyuan14 @guangyey