Add aten::erfinv, aten::exp2, aten::expm1, aten::exponential_ #527

Merged
merged 22 commits into main from hjhee/erfinv
Jul 20, 2024

Conversation

hjhee
Contributor

@hjhee hjhee commented Jul 2, 2024

  • erfinv
  • erfinv_
  • erfinv.out
  • exp2
  • exp2_
  • exp2.out
  • expm1
  • expm1_
  • expm1.out
  • exponential_

@fengyuan14 fengyuan14 changed the title from "add aten::erfinv, aten::exp2, aten::expm1, aten::exponential_" to "Add aten::erfinv, aten::exp2, aten::expm1, aten::exponential_" on Jul 17, 2024
@fengyuan14
Contributor

Please check the failure. Most likely GCC and the SYCL compiler handle std::exp2 for std::complex differently.
[screenshot of the failing check]
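For context on where such a divergence can enter, here is a minimal sketch, assuming exp2 on complex inputs is lowered to exp(z · ln 2); the helper exp2_via_exp is hypothetical and is not the kernel code added in this PR.

```cpp
#include <cmath>
#include <complex>
#include <iostream>
#include <limits>

// Hypothetical helper: exp2(z) = e^(z * ln 2). With this lowering, the
// special-value behavior of the underlying exp implementation decides what
// comes out for non-finite z.
template <typename T>
std::complex<T> exp2_via_exp(const std::complex<T>& z) {
  return std::exp(z * std::log(T(2)));
}

int main() {
  const double inf = std::numeric_limits<double>::infinity();
  // Same class of input as the failing test: both components non-finite.
  std::complex<double> r = exp2_via_exp(std::complex<double>(inf, inf));
  // The printed values depend on the host/device math library, which is
  // exactly where GCC and the SYCL compiler can disagree.
  std::cout << r.real() << " + " << r.imag() << "i\n";
  return 0;
}
```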

@hjhee
Contributor Author

hjhee commented Jul 18, 2024

The handling logic for exp2 is not consistent between CPU and XPU:

cpu_results
tensor([nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj,
        nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj,
        nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj,
        nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj,
        nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj,
        nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj])
cuda_results
tensor([inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj,
        inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, 0.+0.j,
        0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j,
        0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, nan+nanj, nan+nanj,
        nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj,
        nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj],
       device='xpu:0')
cpu_sample.input
tensor([inf+infj, inf+infj, inf+infj, inf+infj, inf+infj, inf+infj, inf+infj, inf+infj, inf+infj, inf-infj, inf-infj, inf-infj, inf-infj, inf-infj,
        inf-infj, inf-infj, inf-infj, inf-infj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, inf+nanj, -inf+infj,
        -inf+infj, -inf+infj, -inf+infj, -inf+infj, -inf+infj, -inf+infj, -inf+infj, -inf+infj, -inf-infj, -inf-infj, -inf-infj, -inf-infj, -inf-infj, -inf-infj,
        -inf-infj, -inf-infj, -inf-infj, -inf+nanj, -inf+nanj, -inf+nanj, -inf+nanj, -inf+nanj, -inf+nanj, -inf+nanj, -inf+nanj, -inf+nanj, nan+infj, nan+infj,
        nan+infj, nan+infj, nan+infj, nan+infj, nan+infj, nan+infj, nan+infj, nan-infj, nan-infj, nan-infj, nan-infj, nan-infj, nan-infj, nan-infj,
        nan-infj, nan-infj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj, nan+nanj])

@fengyuan14
Contributor

The handling logic for exp2 is not consistent between CPU and XPU: […]

We have met this kind of case in other std operators: the behavior differs for std::complex inputs that contain non-finite values (nan or inf).
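A minimal sketch of how the nan+nanj result can arise for such inputs; naive_exp below is hypothetical (not the actual ATen CPU path) and only illustrates that a formula-level computation loses the infinity that special-value rules would preserve.

```cpp
#include <cmath>
#include <complex>
#include <iostream>
#include <limits>

// Hypothetical "textbook" implementation of complex exp: for inf+infj,
// cos(inf) and sin(inf) are NaN, and inf * NaN is NaN, so both output
// components become NaN.
std::complex<double> naive_exp(std::complex<double> z) {
  double m = std::exp(z.real());  // exp(+inf) = +inf
  return {m * std::cos(z.imag()), m * std::sin(z.imag())};
}

int main() {
  const double inf = std::numeric_limits<double>::infinity();
  std::complex<double> r = naive_exp({inf, inf});
  std::cout << r.real() << " + " << r.imag() << "i\n";  // nan + nani
  return 0;
}
```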

@fengyuan14
Contributor

When the input is inf+infj, per the C++ standard the output should be ±inf+nanj. The XPU result aligns with the standard, but the CPU result is nan+nanj.
[screenshot]
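A quick way to check what a given toolchain does for this case, as a sketch: the rule cited above comes from the C standard's Annex G entry for cexp, where exp(+inf + i·inf) is (±inf, NaN) with the sign of the real part unspecified and FE_INVALID raised; the actual printed values depend on the standard or device math library in use.

```cpp
#include <complex>
#include <iostream>
#include <limits>

int main() {
  const double inf = std::numeric_limits<double>::infinity();
  // Expected per the special-value rule: +/-inf for the real part, NaN for
  // the imaginary part. Libraries that do not implement the rule may return
  // nan+nanj instead, which is the CPU/XPU gap discussed in this thread.
  std::complex<double> r = std::exp(std::complex<double>(inf, inf));
  std::cout << r.real() << " + " << r.imag() << "i\n";
  return 0;
}
```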

@fengyuan14 fengyuan14 added this pull request to the merge queue Jul 20, 2024
Merged via the queue into main with commit 2258cb4 Jul 20, 2024
2 checks passed
@fengyuan14 fengyuan14 deleted the hjhee/erfinv branch July 20, 2024 13:08