use fastdoubleparser to improve double parsing speed #54

rudygt · 2022-11-08T17:04:39Z

Description of changes:

as part of the latest updated jackson-databind that recently uptaked the fast double parser into it, we can now take advantage of this to improve our double parsing speed (that we happens very often).

more information about the performance gains from the faster double parser can be found on the original issue at FasterXML/jackson-core#577 (comment)

I can see that in my machine this change helps with the performance, so hopefully we can get some more information from other people results and decide if we want to merge.

thanks!.

Benchmark / Performance:

Original

Reading citylots2
Read 213068 events
EXACT events/sec: 242674.3
WILDCARD events/sec: 190579.6
PREFIX events/sec: 275994.8
SUFFIX events/sec: 278520.3
EQUALS_IGNORE_CASE events/sec: 247465.7
NUMERIC events/sec: 154062.2
ANYTHING-BUT events/sec: 161171.0
COMPLEX_ARRAYS events/sec: 5052.0
PARTIAL_COMBO events/sec: 109942.2
COMBO events/sec: 3000.0
Reading citylots2
Read 213068 events
Finding Rules...
Lots: 10000
Lots: 20000
Lots: 30000
Lots: 40000
Lots: 50000
Lots: 60000
Lots: 70000
Lots: 80000
Lots: 90000
Lots: 100000
Lots: 110000
Lots: 120000
Lots: 130000
Lots: 140000
Lots: 150000
Lots: 160000
Lots: 170000
Lots: 180000
Lots: 190000
Lots: 200000
Lots: 210000
Lines: 213068, Msec: 10197
Events/sec: 20895.2
 Rules/sec: 146266.2
Before: 225.2
After: 3056.0
Per rule: -7077
Turning JSON into field-lists...
Finding Rules...
Lines: 213068, Msec: 2011
Events/sec: 105951.3
Before: 236.5
After: 1278.9
Per rule: -2605
Reading lines...
Finding Rules...
Lots: 10000
Lots: 20000
Lots: 30000
Lots: 40000
Lots: 50000
Lots: 60000
Lots: 70000
Lots: 80000
Lots: 90000
Lots: 100000
Lots: 110000
Lots: 120000
Lots: 130000
Lots: 140000
Lots: 150000
Lots: 160000
Lots: 170000
Lots: 180000
Lots: 190000
Lots: 200000
Lots: 210000
Lines: 213068, Msec: 1340
Events/sec: 159006.0
 Rules/sec: 579099743.3
Reading citylots2
Read 213068 events
Lots: 10000
Lots: 20000
Lots: 30000
Lots: 40000
Lots: 50000
Lots: 60000
Lots: 70000
Lots: 80000
Lots: 90000
Lots: 100000
Lots: 110000
Lots: 120000
Lots: 130000
Lots: 140000
Lots: 150000
Lots: 160000
Lots: 170000
Lots: 180000
Lots: 190000
Lots: 200000
Lots: 210000
Matched: 52527
Lines: 213068, Msec: 11256
Events/sec: 18929.3
Reading lines...
Finding Rules...
Lots: 10000
Lots: 20000
Lots: 30000
Lots: 40000
Lots: 50000
Lots: 60000
Lots: 70000
Lots: 80000
Lots: 90000
Lots: 100000
Lots: 110000
Lots: 120000
Lots: 130000
Lots: 140000
Lots: 150000
Lots: 160000
Lots: 170000
Lots: 180000
Lots: 190000
Lots: 200000
Lots: 210000
Lines: 213068, Msec: 9501
Events/sec: 22425.8
 Rules/sec: 156980.9
DEEP EXACT events/sec: 14285.7

Changes

Reading citylots2
Read 213068 events
EXACT events/sec: 247465.7
WILDCARD events/sec: 184155.6
PREFIX events/sec: 282583.6
SUFFIX events/sec: 274218.8
EQUALS_IGNORE_CASE events/sec: 235434.3
NUMERIC events/sec: 154845.9
ANYTHING-BUT events/sec: 160806.0
COMPLEX_ARRAYS events/sec: 6088.0
PARTIAL_COMBO events/sec: 108874.8
COMBO events/sec: 3660.5
Reading citylots2
Read 213068 events
Finding Rules...
Lots: 10000
Lots: 20000
Lots: 30000
Lots: 40000
Lots: 50000
Lots: 60000
Lots: 70000
Lots: 80000
Lots: 90000
Lots: 100000
Lots: 110000
Lots: 120000
Lots: 130000
Lots: 140000
Lots: 150000
Lots: 160000
Lots: 170000
Lots: 180000
Lots: 190000
Lots: 200000
Lots: 210000
Lines: 213068, Msec: 10301
Events/sec: 20684.2
 Rules/sec: 144789.4
Before: 211.9
After: 2406.3
Per rule: -5485
Turning JSON into field-lists...
Finding Rules...
Lines: 213068, Msec: 2098
Events/sec: 101557.7
Before: 245.3
After: 1352.7
Per rule: -2768
Reading lines...
Finding Rules...
Lots: 10000
Lots: 20000
Lots: 30000
Lots: 40000
Lots: 50000
Lots: 60000
Lots: 70000
Lots: 80000
Lots: 90000
Lots: 100000
Lots: 110000
Lots: 120000
Lots: 130000
Lots: 140000
Lots: 150000
Lots: 160000
Lots: 170000
Lots: 180000
Lots: 190000
Lots: 200000
Lots: 210000
Lines: 213068, Msec: 1381
Events/sec: 154285.3
 Rules/sec: 561907064.4
Reading citylots2
Read 213068 events
Lots: 10000
Lots: 20000
Lots: 30000
Lots: 40000
Lots: 50000
Lots: 60000
Lots: 70000
Lots: 80000
Lots: 90000
Lots: 100000
Lots: 110000
Lots: 120000
Lots: 130000
Lots: 140000
Lots: 150000
Lots: 160000
Lots: 170000
Lots: 180000
Lots: 190000
Lots: 200000
Lots: 210000
Matched: 52527
Lines: 213068, Msec: 10224
Events/sec: 20840.0
Reading lines...
Finding Rules...
Lots: 10000
Lots: 20000
Lots: 30000
Lots: 40000
Lots: 50000
Lots: 60000
Lots: 70000
Lots: 80000
Lots: 90000
Lots: 100000
Lots: 110000
Lots: 120000
Lots: 130000
Lots: 140000
Lots: 150000
Lots: 160000
Lots: 170000
Lots: 180000
Lots: 190000
Lots: 200000
Lots: 210000
Lines: 213068, Msec: 9186
Events/sec: 23194.9
 Rules/sec: 162364.0
DEEP EXACT events/sec: 11111.1

baldawar · 2022-11-08T22:08:47Z

Thanks @rudygt . On

I can see that in my machine this change helps with the performance, so hopefully we can get some more information from other people results and decide if we want to merge.

I'll run this on my laptop tomorrow and confirm if I see the same results

baldawar · 2022-11-08T22:09:02Z

src/main/software/amazon/event/ruler/ByteMachine.java

@@ -85,7 +86,7 @@ Set<NameState> transitionOn(String valString) {
        boolean fieldValueIsNumeric = false;
        if (hasNumeric.get() > 0) {
            try {
-                final double numerically = Double.parseDouble(valString);
+                final double numerically = parseDouble(valString);


would prefer to follow Class.staticMethod(...) pattern here.

ook, I have changed the import

timbray · 2022-11-08T22:12:43Z

I found the presentation of the benchmark results in the PR a little hard to figure out. My first cut, when exploring the impact of a chance, is just to run the RealisticCityLots() benchmark. It’s not perfect but it does stress out the flattening process and is actually a pretty good test if your changes are in the Event JSON parsing.

baldawar · 2022-11-10T02:00:37Z

I found the presentation of the benchmark results in the PR a little hard to figure out.

I'm looking at making benchmarks more readable and to fail in case regressions sneak in. For this change I think NUMERIC events/sec: 154845.9 would be the place to look. Also, any other rule patterns that require numeric matching (like in original citylots and citylots2 tests)

I can see that in my machine this change helps with the performance, so hopefully we can get some more information from other people results and decide if we want to merge.

I'll run this on my laptop tomorrow and confirm if I see the same results

Looks like this doesn't lead to meaningful difference. Based on the originating paper https://arxiv.org/pdf/2101.11408.pdf seems like we'd see some benefits but I think the rest of the code makes a significant impact. Will be checking in profiler later this week.

timbray · 2022-11-10T02:21:00Z

I'm looking at making benchmarks more readable and to fail in case regressions sneak in. For this change I think NUMERIC events/sec: 154845.9 would be the place to look. Also, any other rule patterns that require numeric matching (like in original citylots and citylots2 tests)

I'm curious: Why numeric? Not necessarily disagreeing but wondering why.

rudygt · 2022-11-10T08:57:18Z

I think the change only comes into play when there are a lot of ComparableNumbers being created, during the transitions on the ByteMachine, I think calling it parsing maybe was confusing because it has nothing to do with the json parsing speed?.

baldawar · 2022-11-10T16:53:40Z

src/main/software/amazon/event/ruler/ByteMachine.java

@@ -85,7 +86,7 @@ Set<NameState> transitionOn(String valString) {
        boolean fieldValueIsNumeric = false;
        if (hasNumeric.get() > 0) {
            try {
-                final double numerically = Double.parseDouble(valString);
+                final double numerically = NumberInput.parseDouble(valString);


Looks like ComparableNumber is the bottleneck. I'm not seeing consistent improvement w this change on my laptop, even when I directly use FastDoubleParser.parseDouble(s).

Same is true in Quamina BTW. That and the Sprintf.

@rudygt just catching up back on this thread, do you think its' still worth merging this in?

sorry for letting it hang, I would like to do some isolated benchmarks but currently on vacation, so if we can keep it open for another 20 days or so, if not feel free to close, and I will re open if my tests reveal good results.

no worries. I'll leave this open. enjoy your time-off !

rudygt · 2022-12-27T12:28:05Z

Hello again @baldawar

I think I found the issue, I was not calling the correct method in the first version of the pull.

Also I added a more formal benchmark to compare the changes and with java 11 I see good improvements.

this is the output of running ComparableNumberBenchmark in my setup, also I did a very similar empirical test as the one we used to validate the fast hex to long changes a while ago in ComparableNumberTest.benchmarkDoubleParsers

Let me know if how the numbers look like for you, and from there we could see if it makes sense to merge.

Thanks.

# JMH version: 1.36
# VM version: JDK 11.0.7, OpenJDK 64-Bit Server VM, 11.0.7+10-LTS
# VM invoker: C:\Program Files\Amazon Corretto\jdk11.0.7_10\bin\java.exe
# VM options: -javaagent:C:\Program Files\JetBrains\IntelliJ IDEA 2020.1.2\lib\idea_rt.jar=52940:C:\Program Files\JetBrains\IntelliJ IDEA 2020.1.2\bin -Dfile.encoding=UTF-8
# Blackhole mode: full + dont-inline hint (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 5 iterations, 2 s each
# Measurement: 5 iterations, 5 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: software.amazon.event.ruler.ComparableNumberBenchmark.candidate

# Run progress: 0.00% complete, ETA 00:01:10
# Fork: 1 of 1
# Warmup Iteration   1: 13574.493 ops/s
# Warmup Iteration   2: 15092.026 ops/s
# Warmup Iteration   3: 15172.805 ops/s
# Warmup Iteration   4: 15172.074 ops/s
# Warmup Iteration   5: 15059.023 ops/s
Iteration   1: 15064.241 ops/s
Iteration   2: 15035.477 ops/s
Iteration   3: 14821.209 ops/s
Iteration   4: 14753.557 ops/s
Iteration   5: 14920.381 ops/s


Result "software.amazon.event.ruler.ComparableNumberBenchmark.candidate":
  14918.973 ±(99.9%) 515.175 ops/s [Average]
  (min, avg, max) = (14753.557, 14918.973, 15064.241), stdev = 133.789
  CI (99.9%): [14403.798, 15434.148] (assumes normal distribution)


# JMH version: 1.36
# VM version: JDK 11.0.7, OpenJDK 64-Bit Server VM, 11.0.7+10-LTS
# VM invoker: C:\Program Files\Amazon Corretto\jdk11.0.7_10\bin\java.exe
# VM options: -javaagent:C:\Program Files\JetBrains\IntelliJ IDEA 2020.1.2\lib\idea_rt.jar=52940:C:\Program Files\JetBrains\IntelliJ IDEA 2020.1.2\bin -Dfile.encoding=UTF-8
# Blackhole mode: full + dont-inline hint (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 5 iterations, 2 s each
# Measurement: 5 iterations, 5 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: software.amazon.event.ruler.ComparableNumberBenchmark.original

# Run progress: 50.00% complete, ETA 00:00:35
# Fork: 1 of 1
# Warmup Iteration   1: 4738.090 ops/s
# Warmup Iteration   2: 5253.218 ops/s
# Warmup Iteration   3: 5112.347 ops/s
# Warmup Iteration   4: 5322.917 ops/s
# Warmup Iteration   5: 5315.814 ops/s
Iteration   1: 5355.783 ops/s
Iteration   2: 5344.274 ops/s
Iteration   3: 5506.612 ops/s
Iteration   4: 5568.696 ops/s
Iteration   5: 5572.263 ops/s


Result "software.amazon.event.ruler.ComparableNumberBenchmark.original":
  5469.526 ±(99.9%) 432.194 ops/s [Average]
  (min, avg, max) = (5344.274, 5469.526, 5572.263), stdev = 112.239
  CI (99.9%): [5037.332, 5901.719] (assumes normal distribution)


# Run complete. Total time: 00:01:11

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.

Benchmark                             Mode  Cnt      Score     Error  Units
ComparableNumberBenchmark.candidate  thrpt    5  14918.973 ± 515.175  ops/s
ComparableNumberBenchmark.original   thrpt    5   5469.526 ± 432.194  ops/s

Process finished with exit code 0

baldawar · 2023-01-03T19:33:47Z

src/test/software/amazon/event/ruler/ComparableNumberTest.java

@@ -139,4 +139,38 @@ public void epocTimestampRangeTest() throws Exception {
        }
        assertTrue(m.isEmpty());
    }
+
+    @Test()
+    @Ignore()


Should this be removed?

I was thinking we dont really need this test merged, my intention was only to validate the new code was performing better, but once we make the change I dont see the need to keep this benchmark at all

Yup. I don't see value in this test at the moment. Happy to take it out.

soo I took the test out, also fixe the issue with the import and now I think we are good, Thanks.

https://media2.giphy.com/media/6bJaVXMavwUTGZHFzf/giphy480p.mp4

baldawar · 2023-01-03T19:39:25Z

src/test/software/amazon/event/ruler/ComparableNumberBenchmark.java

+@State(Scope.Benchmark)
+public class ComparableNumberBenchmark {


I'm hesitant on merging this one because it would lead to just one benchmark being disconnected from others. Is it possible to push this into a separate branch? we can merge for this one later, when I've brought rest of the benchmark tests to JMH as well.

agree, I did backup my changes adding jmh to another branch.

baldawar · 2023-01-03T19:40:34Z

I'm fine w merging this after the comments are addressed. If there's any update to the perf numbers on the README, feel free to update them as well.

baldawar · 2023-01-09T16:37:53Z

src/test/software/amazon/event/ruler/ComparableNumberTest.java

-import java.util.Map;
+import java.time.Duration;
+import java.time.Instant;
+import java.util.*;


why the wildcard import?

baldawar · 2023-01-09T16:42:25Z

Gentle reminder to do a minor version bump in pom.xml as part of the PR.

use fastdoubleparser

a5689e4

baldawar requested changes Nov 8, 2022

View reviewed changes

baldawar reviewed Nov 10, 2022

View reviewed changes

baldawar mentioned this pull request Nov 10, 2022

Speeding up Array Consistency Check by using sorted int-int AVL Tree #55

Closed

baldawar reviewed Jan 3, 2023

View reviewed changes

refactor and add basic benchmark

a391b1f

rudygt force-pushed the main branch from 4e6086d to a391b1f Compare January 7, 2023 13:23

baldawar reviewed Jan 9, 2023

View reviewed changes

rudygt added 2 commits January 9, 2023 19:10

remove unneded benchmark

0ccd5ba

bump version up

884585b

baldawar approved these changes Jan 10, 2023

View reviewed changes

baldawar merged commit dfdfa71 into aws:main Jan 10, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

use fastdoubleparser to improve double parsing speed #54

use fastdoubleparser to improve double parsing speed #54

rudygt commented Nov 8, 2022

baldawar commented Nov 8, 2022

baldawar Nov 8, 2022

rudygt Nov 8, 2022

timbray commented Nov 8, 2022

baldawar commented Nov 10, 2022 •

edited

Loading

timbray commented Nov 10, 2022

rudygt commented Nov 10, 2022

baldawar Nov 10, 2022

timbray Nov 10, 2022

baldawar Dec 5, 2022

rudygt Dec 6, 2022

baldawar Dec 9, 2022

rudygt commented Dec 27, 2022

baldawar Jan 3, 2023

rudygt Jan 7, 2023 •

edited

Loading

baldawar Jan 9, 2023

rudygt Jan 9, 2023

baldawar Jan 10, 2023

baldawar Jan 3, 2023

rudygt Jan 7, 2023

baldawar commented Jan 3, 2023

baldawar Jan 9, 2023

baldawar commented Jan 9, 2023

		@State(Scope.Benchmark)
		public class ComparableNumberBenchmark {

use fastdoubleparser to improve double parsing speed #54

use fastdoubleparser to improve double parsing speed #54

Conversation

rudygt commented Nov 8, 2022

Description of changes:

Benchmark / Performance:

Original

Changes

baldawar commented Nov 8, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

timbray commented Nov 8, 2022

baldawar commented Nov 10, 2022 • edited Loading

timbray commented Nov 10, 2022

rudygt commented Nov 10, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rudygt commented Dec 27, 2022

Choose a reason for hiding this comment

rudygt Jan 7, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

baldawar commented Jan 3, 2023

Choose a reason for hiding this comment

baldawar commented Jan 9, 2023

baldawar commented Nov 10, 2022 •

edited

Loading

rudygt Jan 7, 2023 •

edited

Loading