Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[perf] replace Math.max usage with reduce and pure compare #1520

Conversation

lifeart
Copy link
Contributor

@lifeart lifeart commented Dec 5, 2023

It seems reduce is 5 times faster for our case https://jsben.ch/Qosu7
image

Here is V8 Math.Max implementation: https://github.com/v8/v8/blob/cd81dd6d740ff82a1abbc68615e8769bd467f91e/src/js/math.js#L77-L102 (old)
https://github.com/v8/v8/blob/04f51bc70a38fbea743588e41290bea40830a486/src/builtins/math.tq#L135 (new)

image

It seems it uses more code paths than we actually need:

  • extra function call
  • count amount of arguments
  • cast first arg to number
  • cast second arg to number
  • compare arg1 and arg2

In our case both args already numbers and we could take 1 step instead of 5

If we assume every step is an opcode and there is no caching, for case where we have 10 tags + 10 subtags, new comparison should took 100 opcodes, and old (with Math.max) - 500. This numbers quite relative to numbers we see in micro-bench.


Likely need to check macro-bench to verify

@lifeart lifeart changed the title [perf] replace Math.max usage to reduce and pure compare [perf] replace Math.max usage with reduce and pure compare Dec 5, 2023
@lifeart lifeart force-pushed the perf-replace-math-max-with-reduce-and-pure-comare branch from 451b301 to e099bd9 Compare December 5, 2023 23:00
@lifeart
Copy link
Contributor Author

lifeart commented Dec 6, 2023

Here is a comparison results with #1515 (comment)

image

Last image - math.max + all changes from linked PR.

I can't see dramatic improvements in performance, but I see difference in results accuracy here:

Pasted Graphic 1

@NullVoxPopuli
Copy link
Contributor

Here is what I get on FireFox
your bench
image

me changing the input values
image

@NullVoxPopuli
Copy link
Contributor

thanks for running the benchmark and providing results / reasoning / research / etc! it helps a lot with understanding the why behind this change, rather than just the outcome

@NullVoxPopuli NullVoxPopuli merged commit efcbfb7 into glimmerjs:main Dec 20, 2023
5 checks passed
@lifeart lifeart deleted the perf-replace-math-max-with-reduce-and-pure-comare branch December 20, 2023 21:30
@victor-homyakov
Copy link

victor-homyakov commented Jan 25, 2024

@lifeart @NullVoxPopuli please note that the benchmark https://jsben.ch/Qosu7 has some problems.

  1. Looks like "Code block 3" has an error in implementation:
for (const tag of values) {
  result = Math.max(values[tag], result);
  //       ^ maybe Math.max(tag, result)
}

This may have some penalty in speed (actually negligible in my measurements). Anyways, you should pay more attention to the benchmarked code.

  1. The data under test is the series of increasing numbers from zero to 19 [0, 1, 2, ... 19, 0, 1, 2, ... 19, ...]. It is very predictable and may affect the benchmark results.
  2. The data creation is placed in the "boilerplate" part, which is part of the benchmark. Time for initialization of the array is added to every variant of code. Array initialization should be placed in the "setup" part, which isn't benchmarked.
  3. The way of array creation may affect performance. I.e. var values = []; ... values.push(...); (FixedArray with PACKED_SMI_ELEMENTS in V8) may behave differently from var values = new Array(5000); ... values[i] = ...; (FixedArray with HOLEY_SMI_ELEMENTS in V8). Benchmark should use the same creation method as in the original code.

Here is the improved benchmark: https://jsben.ch/eNWlG and its results:

Chrome 120 on Apple M1:
image

Safari 17.2 on Apple M1:
image

Firefox 122 on Apple M1:
image

@victor-homyakov
Copy link

Results for Windows 10 64-bit on AMD Ryzen 7 5800H CPU:

Chrome 121
Benchmark Chrome 121 Windows

Firefox 122
Benchmark Firefox 122 Windows

@NullVoxPopuli
Copy link
Contributor

seems like it's worth a PR, if you have the time! 🎉 nice work!

@victor-homyakov
Copy link

I'd say that the classical for(;;) loop with if() is the best choice taking into account all the platforms it may run on.

@victor-homyakov
Copy link

seems like it's worth a PR, if you have the time! 🎉 nice work!

tsconfig has "noUncheckedIndexedAccess": true and to pass type checking the loop should look like

for (let i = 0; i < subtag.length; i++) {
  const value = subtag[i]![COMPUTE]();
  revision = Math.max(value, revision);
}

or

for (let i = 0; i < subtag.length; i++) {
  const tag = subtag[i];
  if (tag !== undefined) {
    const value = tag[COMPUTE]();
    revision = Math.max(value, revision);
  }
}

Which one do you prefer?

Fortunately, both variants show close results in benchmark https://jsben.ch/we4qL on current versions of Chrome, Firefox, and Safari.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants