Changed AutoVectorize to use return by value for better performance #1

gitpy · 2022-11-10T18:49:10Z

This is a proof of concept showing that the auto-vectorizer can reach performance similar to the manual SSE and ISPC versions.

The issue with return by reference is that the optimizer is bad at making assumptions about memory. In general the optimizer is also very good at widening, but widening manually can make things harder for it, because the code is then no longer in the compiler's canonical form for widening.
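To illustrate the memory point, here is a minimal sketch rather than the code from this PR. `Vec4`, `add_scaled_ref` and `add_scaled_val` are hypothetical names made up for the example. With the out-parameter version the compiler generally has to assume that `out` may alias `a` or `b`, while in the by-value version the result is a fresh local object, which usually makes the loop easier to vectorize.

```cpp
#include <array>
#include <cstddef>

// Hypothetical 4-lane type, for illustration only (not taken from the PR).
using Vec4 = std::array<float, 4>;

// Out-parameter form: `out` may refer to the same object as `a` or `b`,
// so the compiler has to be conservative about reordering loads and stores.
void add_scaled_ref(const Vec4& a, const Vec4& b, float s, Vec4& out) {
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = a[i] + s * b[i];
    }
}

// Return-by-value form: `result` is a new local object, so writes to it
// cannot modify the inputs while the loop is running.
Vec4 add_scaled_val(const Vec4& a, const Vec4& b, float s) {
    Vec4 result{};
    for (std::size_t i = 0; i < result.size(); ++i) {
        result[i] = a[i] + s * b[i];
    }
    return result;
}
```

Both functions compute the same values. Whether the second one actually vectorizes better depends on the compiler and flags, so it is worth checking the generated assembly.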

I also changed main.cpp to -O1. This is mostly required to get rid of the tuple boilerplate, and return by value often benefits from having optimizations enabled for the caller.
I also checked the assembly: AutoVectorize doesn't get inlined, so it gets no benefit from this change.

This pull request isn't really meant to be merged as is. It is meant to show that the vectorizer goes a long way when the code is structured in a way the optimizer can work with.

My local numbers are:

SSE: 10.8873 ns average
  Total time for 100000 runs: 1088.73 μs
  ...

Autovectorize: 9.08702 ns average
  Total time for 100000 runs: 908.702 μs
  ...

ISPC: 10.2582 ns average
  Total time for 100000 runs: 1025.81 μs
  ...

