Use std::countl_zero instead of __builtin_clz #7025

Merged
merged 1 commit into from
Sep 28, 2024


SiarheiFedartsou
Member

@SiarheiFedartsou SiarheiFedartsou commented Sep 1, 2024

This is also a micro-optimisation: we had no specialization for unsigned char, so it fell back to the loop-based generic implementation.

Benchmark Results

Benchmark Base PR
alias aliased u32: 11069.7
plain u32: 11048.5
aliased double: 15115.3
plain double: 15180.6
aliased u32: 11074.7
plain u32: 10982.7
aliased double: 15061.9
plain double: 15050.6
e2e_match_ch Ops: 27.27 ± 0.02 ops/s. Best: 27.31 ops/s
Total: 4803.52ms ± 3.42ms. Best: 4796.74ms
Min time: 3.23ms ± 0.05ms
Mean time: 36.67ms ± 0.02ms
Median time: 24.95ms ± 0.20ms
95th percentile: 121.80ms ± 0.29ms
99th percentile: 146.91ms ± 0.26ms
Max time: 155.73ms ± 0.66ms
Ops: 27.20 ± 0.02 ops/s. Best: 27.23 ops/s
Total: 4816.91ms ± 3.59ms. Best: 4811.51ms
Min time: 3.28ms ± 0.03ms
Mean time: 36.77ms ± 0.03ms
Median time: 24.88ms ± 0.10ms
95th percentile: 122.82ms ± 0.14ms
99th percentile: 147.64ms ± 0.57ms
Max time: 156.20ms ± 0.55ms
e2e_match_mld Ops: 43.48 ± 0.03 ops/s. Best: 43.51 ops/s
Total: 3012.93ms ± 1.82ms. Best: 3010.51ms
Min time: 2.61ms ± 0.03ms
Mean time: 23.00ms ± 0.01ms
Median time: 12.06ms ± 0.10ms
95th percentile: 74.89ms ± 0.14ms
99th percentile: 87.10ms ± 0.13ms
Max time: 101.40ms ± 0.33ms
Ops: 43.67 ± 0.01 ops/s. Best: 43.69 ops/s
Total: 2999.48ms ± 0.93ms. Best: 2998.17ms
Min time: 2.56ms ± 0.01ms
Mean time: 22.90ms ± 0.01ms
Median time: 12.15ms ± 0.10ms
95th percentile: 74.45ms ± 0.21ms
99th percentile: 86.59ms ± 0.30ms
Max time: 100.60ms ± 0.50ms
e2e_nearest_ch Ops: 634.41 ± 2.83 ops/s. Best: 639.27 ops/s
Total: 1576.06ms ± 6.62ms. Best: 1564.27ms
Min time: 1.29ms ± 0.01ms
Mean time: 1.58ms ± 0.01ms
Median time: 1.54ms ± 0.01ms
95th percentile: 1.95ms ± 0.01ms
99th percentile: 2.04ms ± 0.02ms
Max time: 9.32ms ± 7.41ms
Ops: 637.44 ± 4.69 ops/s. Best: 646.62 ops/s
Total: 1569.00ms ± 12.79ms. Best: 1546.51ms
Min time: 1.28ms ± 0.01ms
Mean time: 1.57ms ± 0.01ms
Median time: 1.53ms ± 0.01ms
95th percentile: 1.96ms ± 0.02ms
99th percentile: 2.05ms ± 0.01ms
Max time: 9.32ms ± 7.37ms
e2e_nearest_mld Ops: 637.86 ± 5.64 ops/s. Best: 646.87 ops/s
Total: 1567.53ms ± 13.54ms. Best: 1545.91ms
Min time: 1.28ms ± 0.01ms
Mean time: 1.57ms ± 0.01ms
Median time: 1.53ms ± 0.01ms
95th percentile: 1.95ms ± 0.01ms
99th percentile: 2.03ms ± 0.01ms
Max time: 9.29ms ± 7.36ms
Ops: 636.39 ± 2.57 ops/s. Best: 641.54 ops/s
Total: 1571.25ms ± 6.58ms. Best: 1558.76ms
Min time: 1.28ms ± 0.01ms
Mean time: 1.57ms ± 0.01ms
Median time: 1.53ms ± 0.01ms
95th percentile: 1.97ms ± 0.01ms
99th percentile: 2.06ms ± 0.02ms
Max time: 9.41ms ± 7.48ms
e2e_route_ch Ops: 216.14 ± 0.61 ops/s. Best: 217.33 ops/s
Total: 4626.72ms ± 13.20ms. Best: 4601.23ms
Min time: 1.85ms ± 0.06ms
Mean time: 4.63ms ± 0.01ms
Median time: 4.73ms ± 0.01ms
95th percentile: 6.06ms ± 0.03ms
99th percentile: 6.65ms ± 0.03ms
Max time: 13.34ms ± 6.33ms
Ops: 215.67 ± 0.40 ops/s. Best: 216.33 ops/s
Total: 4636.53ms ± 8.51ms. Best: 4622.66ms
Min time: 1.88ms ± 0.04ms
Mean time: 4.64ms ± 0.01ms
Median time: 4.74ms ± 0.01ms
95th percentile: 6.08ms ± 0.01ms
99th percentile: 6.65ms ± 0.05ms
Max time: 13.31ms ± 6.36ms
e2e_route_mld Ops: 178.15 ± 0.25 ops/s. Best: 178.52 ops/s
Total: 5613.03ms ± 8.01ms. Best: 5601.67ms
Min time: 1.89ms ± 0.03ms
Mean time: 5.61ms ± 0.01ms
Median time: 5.73ms ± 0.01ms
95th percentile: 7.62ms ± 0.01ms
99th percentile: 8.14ms ± 0.05ms
Max time: 14.72ms ± 5.96ms
Ops: 178.09 ± 0.31 ops/s. Best: 178.58 ops/s
Total: 5614.88ms ± 9.71ms. Best: 5599.62ms
Min time: 1.86ms ± 0.06ms
Mean time: 5.61ms ± 0.01ms
Median time: 5.74ms ± 0.02ms
95th percentile: 7.63ms ± 0.03ms
99th percentile: 8.22ms ± 0.09ms
Max time: 14.70ms ± 5.92ms
e2e_table_ch Ops: 213.31 ± 0.34 ops/s. Best: 213.83 ops/s
Total: 4688.06ms ± 7.40ms. Best: 4676.63ms
Min time: 2.48ms ± 0.05ms
Mean time: 4.69ms ± 0.01ms
Median time: 4.69ms ± 0.01ms
95th percentile: 6.39ms ± 0.01ms
99th percentile: 6.72ms ± 0.02ms
Max time: 13.93ms ± 7.08ms
Ops: 212.93 ± 0.55 ops/s. Best: 213.57 ops/s
Total: 4696.15ms ± 12.20ms. Best: 4682.23ms
Min time: 2.44ms ± 0.04ms
Mean time: 4.70ms ± 0.01ms
Median time: 4.69ms ± 0.01ms
95th percentile: 6.41ms ± 0.01ms
99th percentile: 6.76ms ± 0.02ms
Max time: 13.93ms ± 7.08ms
e2e_table_mld Ops: 69.26 ± 0.04 ops/s. Best: 69.31 ops/s
Total: 14438.90ms ± 8.91ms. Best: 14427.40ms
Min time: 6.05ms ± 0.06ms
Mean time: 14.44ms ± 0.01ms
Median time: 14.38ms ± 0.01ms
95th percentile: 21.99ms ± 0.04ms
99th percentile: 23.29ms ± 0.08ms
Max time: 29.43ms ± 5.63ms
Ops: 69.61 ± 0.06 ops/s. Best: 69.70 ops/s
Total: 14365.38ms ± 11.95ms. Best: 14347.04ms
Min time: 5.97ms ± 0.04ms
Mean time: 14.37ms ± 0.01ms
Median time: 14.27ms ± 0.02ms
95th percentile: 21.89ms ± 0.03ms
99th percentile: 23.10ms ± 0.10ms
Max time: 29.52ms ± 5.66ms
e2e_trip_ch Ops: 62.41 ± 0.02 ops/s. Best: 62.43 ops/s
Total: 16023.15ms ± 6.47ms. Best: 16016.96ms
Min time: 2.49ms ± 0.12ms
Mean time: 16.02ms ± 0.01ms
Median time: 15.21ms ± 0.04ms
95th percentile: 28.21ms ± 0.04ms
99th percentile: 30.26ms ± 0.12ms
Max time: 33.01ms ± 1.35ms
Ops: 62.23 ± 0.05 ops/s. Best: 62.32 ops/s
Total: 16070.08ms ± 14.35ms. Best: 16045.40ms
Min time: 2.45ms ± 0.20ms
Mean time: 16.07ms ± 0.02ms
Median time: 15.30ms ± 0.04ms
95th percentile: 28.29ms ± 0.04ms
99th percentile: 30.28ms ± 0.16ms
Max time: 33.00ms ± 1.34ms
e2e_trip_mld Ops: 36.75 ± 0.01 ops/s. Best: 36.77 ops/s
Total: 27212.10ms ± 10.08ms. Best: 27195.74ms
Min time: 2.47ms ± 0.19ms
Mean time: 27.21ms ± 0.01ms
Median time: 26.35ms ± 0.09ms
95th percentile: 44.36ms ± 0.06ms
99th percentile: 47.05ms ± 0.16ms
Max time: 49.35ms ± 0.16ms
Ops: 36.76 ± 0.02 ops/s. Best: 36.79 ops/s
Total: 27203.11ms ± 18.49ms. Best: 27184.45ms
Min time: 2.36ms ± 0.12ms
Mean time: 27.20ms ± 0.02ms
Median time: 26.34ms ± 0.03ms
95th percentile: 44.29ms ± 0.07ms
99th percentile: 46.99ms ± 0.09ms
Max time: 49.36ms ± 0.20ms
json-render String: 8.93503ms
Stringstream: 14.1241ms
Vector: 9.50397ms
String: 8.96299ms
Stringstream: 14.6894ms
Vector: 9.43614ms
match_ch Default radius:
7.05052ms/req at 82 coordinate
0.0859819ms/coordinate
Radius 10m:
24.9526ms/req at 82 coordinate
0.3043ms/coordinate
Default radius:
7.08165ms/req at 82 coordinate
0.0863615ms/coordinate
Radius 10m:
25.0726ms/req at 82 coordinate
0.305764ms/coordinate
match_mld Default radius:
4.33738ms/req at 82 coordinate
0.0528949ms/coordinate
Radius 10m:
16.243ms/req at 82 coordinate
0.198085ms/coordinate
Default radius:
4.32495ms/req at 82 coordinate
0.0527433ms/coordinate
Radius 10m:
16.1954ms/req at 82 coordinate
0.197505ms/coordinate
| Benchmark | Base | PR |
| --- | --- | --- |
| node_match_ch | Ops: 163.8 ± 0.7 ops/s. Best: 164.9 ops/s | Ops: 164.5 ± 0.8 ops/s. Best: 165.7 ops/s |
| node_match_mld | Ops: 219.9 ± 1.6 ops/s. Best: 222.2 ops/s | Ops: 218.1 ± 0.8 ops/s. Best: 219.2 ops/s |
| node_nearest_ch | Ops: 9923.7 ± 783.5 ops/s. Best: 11215.1 ops/s | Ops: 10537.7 ± 855.8 ops/s. Best: 12030.1 ops/s |
| node_nearest_mld | Ops: 10043.1 ± 678.1 ops/s. Best: 11151.6 ops/s | Ops: 9528.5 ± 310.4 ops/s. Best: 10062.4 ops/s |
| node_route_ch | Ops: 983.3 ± 26.5 ops/s. Best: 1025.5 ops/s | Ops: 996.4 ± 28.3 ops/s. Best: 1023.7 ops/s |
| node_route_mld | Ops: 509.0 ± 5.5 ops/s. Best: 515.4 ops/s | Ops: 508.9 ± 2.8 ops/s. Best: 514.2 ops/s |
| node_table_ch | Ops: 177.9 ± 2.0 ops/s. Best: 181.3 ops/s | Ops: 178.0 ± 1.4 ops/s. Best: 180.6 ops/s |
| node_table_mld | Ops: 37.9 ± 0.1 ops/s. Best: 38.0 ops/s | Ops: 38.0 ± 0.1 ops/s. Best: 38.1 ops/s |
| node_trip_ch | Ops: 180.9 ± 0.7 ops/s. Best: 181.7 ops/s | Ops: 181.2 ± 0.7 ops/s. Best: 182.4 ops/s |
| node_trip_mld | Ops: 60.6 ± 0.2 ops/s. Best: 60.9 ops/s | Ops: 61.7 ± 0.2 ops/s. Best: 61.9 ops/s |
| osrm_contract | Time: 185.71s Peak RAM: 194.96MB | Time: 185.46s Peak RAM: 194.96MB |
| osrm_customize | Time: 2.54s Peak RAM: 112.46MB | Time: 2.52s Peak RAM: 112.46MB |
| osrm_extract | Time: 24.36s Peak RAM: 398.62MB | Time: 24.10s Peak RAM: 398.82MB |
| osrm_partition | Time: 5.91s Peak RAM: 121.54MB | Time: 5.90s Peak RAM: 121.59MB |
packedvector random write:
std::vector 186041 ms
util::packed_vector 384681 ms
slowdown: 2.06772
random read:
std::vector 101158 ms
util::packed_vector 195032 ms
slowdown: 1.92799
random write:
std::vector 183440 ms
util::packed_vector 385823 ms
slowdown: 2.10326
random read:
std::vector 100386 ms
util::packed_vector 195612 ms
slowdown: 1.94859
random_match_ch 500 matches, default radius
ops: 121.49 ± 0.24 ops/s. best: 121.82ops/s.
total: 469.17 ± 0.94ms. best: 467.91ms.
avg: 8.23 ± 0.02ms
min: 0.23 ± 0.00ms
max: 43.10 ± 0.12ms
p99: 43.10 ± 0.12ms

500 matches, radius=10
ops: 35.07 ± 0.02 ops/s. best: 35.10ops/s.
total: 1824.79 ± 1.23ms. best: 1823.58ms.
avg: 28.51 ± 0.02ms
min: 0.23 ± 0.00ms
max: 430.14 ± 1.31ms
p99: 430.14 ± 1.31ms

500 matches, radius=20
ops: 8.16 ± 0.01 ops/s. best: 8.19ops/s.
total: 7961.59 ± 11.99ms. best: 7941.11ms.
avg: 122.49 ± 0.18ms
min: 0.48 ± 0.00ms
max: 2274.79 ± 4.45ms
p99: 2274.79 ± 4.45ms

Peak RAM: 55.500MB
500 matches, default radius
ops: 120.20 ± 0.24 ops/s. best: 120.47ops/s.
total: 474.21 ± 0.95ms. best: 473.14ms.
avg: 8.32 ± 0.02ms
min: 0.23 ± 0.00ms
max: 43.67 ± 0.12ms
p99: 43.67 ± 0.12ms

500 matches, radius=10
ops: 34.53 ± 0.02 ops/s. best: 34.55ops/s.
total: 1853.56 ± 1.09ms. best: 1852.60ms.
avg: 28.96 ± 0.02ms
min: 0.23 ± 0.00ms
max: 439.98 ± 1.22ms
p99: 439.98 ± 1.22ms

500 matches, radius=20
ops: 8.02 ± 0.01 ops/s. best: 8.04ops/s.
total: 8109.03 ± 12.74ms. best: 8087.87ms.
avg: 124.75 ± 0.20ms
min: 0.49 ± 0.00ms
max: 2341.63 ± 4.78ms
p99: 2341.63 ± 4.78ms

Peak RAM: 55.500MB
random_match_mld 500 matches, default radius
ops: 206.63 ± 0.46 ops/s. best: 207.03ops/s.
total: 275.86 ± 0.61ms. best: 275.32ms.
avg: 4.84 ± 0.01ms
min: 0.20 ± 0.00ms
max: 27.13 ± 0.01ms
p99: 27.13 ± 0.01ms

500 matches, radius=10
ops: 73.08 ± 0.10 ops/s. best: 73.29ops/s.
total: 875.71 ± 1.24ms. best: 873.24ms.
avg: 13.68 ± 0.02ms
min: 0.22 ± 0.00ms
max: 162.04 ± 0.71ms
p99: 162.04 ± 0.71ms

500 matches, radius=20
ops: 15.58 ± 0.01 ops/s. best: 15.59ops/s.
total: 4172.86 ± 2.70ms. best: 4168.61ms.
avg: 64.20 ± 0.04ms
min: 0.29 ± 0.00ms
max: 844.20 ± 1.34ms
p99: 844.20 ± 1.34ms

Peak RAM: 51.500MB
500 matches, default radius
ops: 206.96 ± 0.51 ops/s. best: 207.40ops/s.
total: 275.42 ± 0.68ms. best: 274.83ms.
avg: 4.83 ± 0.01ms
min: 0.20 ± 0.00ms
max: 27.06 ± 0.02ms
p99: 27.06 ± 0.02ms

500 matches, radius=10
ops: 73.19 ± 0.10 ops/s. best: 73.39ops/s.
total: 874.45 ± 1.15ms. best: 872.02ms.
avg: 13.66 ± 0.02ms
min: 0.22 ± 0.00ms
max: 161.82 ± 0.65ms
p99: 161.82 ± 0.65ms

500 matches, radius=20
ops: 15.59 ± 0.01 ops/s. best: 15.61ops/s.
total: 4168.91 ± 2.78ms. best: 4163.56ms.
avg: 64.14 ± 0.04ms
min: 0.29 ± 0.00ms
max: 842.88 ± 1.38ms
p99: 842.88 ± 1.38ms

Peak RAM: 51.500MB
random_nearest_ch 10000 nearest, number_of_results=1
ops: 21816.96 ± 38.23 ops/s. best: 21870.55ops/s.
total: 458.36 ± 0.89ms. best: 457.24ms.
avg: 0.05 ± 0.00ms
min: 0.02 ± 0.00ms
max: 0.16 ± 0.03ms
p99: 0.10 ± 0.00ms

10000 nearest, number_of_results=5
ops: 16064.10 ± 16.54 ops/s. best: 16080.89ops/s.
total: 622.51 ± 0.64ms. best: 621.86ms.
avg: 0.06 ± 0.00ms
min: 0.03 ± 0.00ms
max: 0.15 ± 0.00ms
p99: 0.12 ± 0.00ms

10000 nearest, number_of_results=10
ops: 12284.95 ± 6.32 ops/s. best: 12292.23ops/s.
total: 814.00 ± 0.42ms. best: 813.52ms.
avg: 0.08 ± 0.00ms
min: 0.04 ± 0.00ms
max: 0.19 ± 0.00ms
p99: 0.15 ± 0.00ms

Peak RAM: 35.000MB
10000 nearest, number_of_results=1
ops: 21871.46 ± 42.45 ops/s. best: 21908.22ops/s.
total: 457.22 ± 0.89ms. best: 456.45ms.
avg: 0.05 ± 0.00ms
min: 0.02 ± 0.00ms
max: 0.16 ± 0.03ms
p99: 0.10 ± 0.00ms

10000 nearest, number_of_results=5
ops: 16156.24 ± 7.22 ops/s. best: 16165.27ops/s.
total: 618.96 ± 0.28ms. best: 618.61ms.
avg: 0.06 ± 0.00ms
min: 0.03 ± 0.00ms
max: 0.16 ± 0.01ms
p99: 0.12 ± 0.00ms

10000 nearest, number_of_results=10
ops: 12402.64 ± 5.02 ops/s. best: 12407.32ops/s.
total: 806.28 ± 0.33ms. best: 805.98ms.
avg: 0.08 ± 0.00ms
min: 0.04 ± 0.00ms
max: 0.19 ± 0.00ms
p99: 0.15 ± 0.00ms

Peak RAM: 35.000MB
random_nearest_mld 10000 nearest, number_of_results=1
ops: 21844.41 ± 37.74 ops/s. best: 21875.03ops/s.
total: 457.79 ± 0.81ms. best: 457.14ms.
avg: 0.05 ± 0.00ms
min: 0.02 ± 0.00ms
max: 0.15 ± 0.03ms
p99: 0.10 ± 0.00ms

10000 nearest, number_of_results=5
ops: 16073.01 ± 9.85 ops/s. best: 16083.18ops/s.
total: 622.16 ± 0.39ms. best: 621.77ms.
avg: 0.06 ± 0.00ms
min: 0.03 ± 0.00ms
max: 0.15 ± 0.00ms
p99: 0.12 ± 0.00ms

10000 nearest, number_of_results=10
ops: 12283.20 ± 3.86 ops/s. best: 12289.47ops/s.
total: 814.12 ± 0.26ms. best: 813.70ms.
avg: 0.08 ± 0.00ms
min: 0.04 ± 0.00ms
max: 0.20 ± 0.01ms
p99: 0.15 ± 0.00ms

Peak RAM: 35.000MB
10000 nearest, number_of_results=1
ops: 21889.40 ± 36.89 ops/s. best: 21918.04ops/s.
total: 456.84 ± 0.77ms. best: 456.25ms.
avg: 0.05 ± 0.00ms
min: 0.02 ± 0.00ms
max: 0.15 ± 0.02ms
p99: 0.10 ± 0.00ms

10000 nearest, number_of_results=5
ops: 16173.01 ± 5.54 ops/s. best: 16178.22ops/s.
total: 618.31 ± 0.21ms. best: 618.11ms.
avg: 0.06 ± 0.00ms
min: 0.03 ± 0.00ms
max: 0.15 ± 0.00ms
p99: 0.12 ± 0.00ms

10000 nearest, number_of_results=10
ops: 12407.32 ± 5.47 ops/s. best: 12414.73ops/s.
total: 805.98 ± 0.35ms. best: 805.49ms.
avg: 0.08 ± 0.00ms
min: 0.04 ± 0.00ms
max: 0.20 ± 0.01ms
p99: 0.15 ± 0.00ms

Peak RAM: 35.000MB
random_route_ch 1000 routes, 3 coordinates, no alternatives, overview=full, steps=true
ops: 282.93 ± 0.18 ops/s. best: 283.19ops/s.
total: 3477.95 ± 2.20ms. best: 3474.68ms.
avg: 3.53 ± 0.00ms
min: 0.54 ± 0.00ms
max: 6.09 ± 0.05ms
p99: 5.28 ± 0.02ms

1000 routes, 2 coordinates, 3 alternatives, overview=full, steps=true
ops: 330.77 ± 0.07 ops/s. best: 330.87ops/s.
total: 3023.24 ± 0.66ms. best: 3022.33ms.
avg: 3.02 ± 0.00ms
min: 0.08 ± 0.00ms
max: 7.91 ± 0.05ms
p99: 7.10 ± 0.01ms

1000 routes, 3 coordinates, no alternatives, overview=false, steps=false
ops: 578.75 ± 0.18 ops/s. best: 578.92ops/s.
total: 1700.23 ± 0.53ms. best: 1699.71ms.
avg: 1.73 ± 0.00ms
min: 0.37 ± 0.00ms
max: 2.78 ± 0.01ms
p99: 2.43 ± 0.00ms

1000 routes, 2 coordinates, 3 alternatives, overview=false, steps=false
ops: 626.96 ± 0.58 ops/s. best: 628.03ops/s.
total: 1595.00 ± 1.48ms. best: 1592.28ms.
avg: 1.59 ± 0.00ms
min: 0.06 ± 0.00ms
max: 4.22 ± 0.03ms
p99: 3.96 ± 0.01ms

Peak RAM: 84.000MB
1000 routes, 3 coordinates, no alternatives, overview=full, steps=true
ops: 281.47 ± 0.12 ops/s. best: 281.57ops/s.
total: 3495.94 ± 1.51ms. best: 3494.70ms.
avg: 3.55 ± 0.00ms
min: 0.51 ± 0.00ms
max: 6.04 ± 0.01ms
p99: 5.28 ± 0.00ms

1000 routes, 2 coordinates, 3 alternatives, overview=full, steps=true
ops: 331.21 ± 0.05 ops/s. best: 331.28ops/s.
total: 3019.22 ± 0.45ms. best: 3018.60ms.
avg: 3.02 ± 0.00ms
min: 0.08 ± 0.00ms
max: 7.91 ± 0.05ms
p99: 7.10 ± 0.01ms

1000 routes, 3 coordinates, no alternatives, overview=false, steps=false
ops: 579.30 ± 0.09 ops/s. best: 579.42ops/s.
total: 1698.61 ± 0.25ms. best: 1698.24ms.
avg: 1.73 ± 0.00ms
min: 0.37 ± 0.00ms
max: 2.76 ± 0.01ms
p99: 2.45 ± 0.01ms

1000 routes, 2 coordinates, 3 alternatives, overview=false, steps=false
ops: 631.73 ± 0.05 ops/s. best: 631.81ops/s.
total: 1582.95 ± 0.13ms. best: 1582.75ms.
avg: 1.58 ± 0.00ms
min: 0.06 ± 0.00ms
max: 4.19 ± 0.00ms
p99: 3.92 ± 0.00ms

Peak RAM: 84.000MB
random_route_mld 1000 routes, 3 coordinates, no alternatives, overview=full, steps=true
ops: 146.49 ± 0.03 ops/s. best: 146.52ops/s.
total: 6717.05 ± 1.20ms. best: 6715.67ms.
avg: 6.83 ± 0.00ms
min: 0.53 ± 0.01ms
max: 16.52 ± 0.02ms
p99: 11.12 ± 0.00ms

1000 routes, 2 coordinates, 3 alternatives, overview=full, steps=true
ops: 139.44 ± 0.00 ops/s. best: 139.44ops/s.
total: 7171.70 ± 0.25ms. best: 7171.38ms.
avg: 7.17 ± 0.00ms
min: 0.08 ± 0.00ms
max: 16.49 ± 0.02ms
p99: 15.54 ± 0.04ms

1000 routes, 3 coordinates, no alternatives, overview=false, steps=false
ops: 204.19 ± 0.02 ops/s. best: 204.23ops/s.
total: 4819.03 ± 0.56ms. best: 4818.18ms.
avg: 4.90 ± 0.00ms
min: 0.39 ± 0.01ms
max: 13.73 ± 0.01ms
p99: 8.45 ± 0.00ms

1000 routes, 2 coordinates, 3 alternatives, overview=false, steps=false
ops: 175.75 ± 0.02 ops/s. best: 175.78ops/s.
total: 5689.90 ± 0.81ms. best: 5688.88ms.
avg: 5.69 ± 0.00ms
min: 0.06 ± 0.00ms
max: 12.84 ± 0.03ms
p99: 11.91 ± 0.02ms

Peak RAM: 73.984MB
1000 routes, 3 coordinates, no alternatives, overview=full, steps=true
ops: 146.89 ± 0.03 ops/s. best: 146.92ops/s.
total: 6699.11 ± 1.32ms. best: 6697.58ms.
avg: 6.81 ± 0.00ms
min: 0.53 ± 0.00ms
max: 16.47 ± 0.02ms
p99: 11.09 ± 0.01ms

1000 routes, 2 coordinates, 3 alternatives, overview=full, steps=true
ops: 140.00 ± 0.03 ops/s. best: 140.04ops/s.
total: 7142.85 ± 1.34ms. best: 7140.74ms.
avg: 7.14 ± 0.00ms
min: 0.07 ± 0.00ms
max: 16.42 ± 0.02ms
p99: 15.46 ± 0.03ms

1000 routes, 3 coordinates, no alternatives, overview=false, steps=false
ops: 204.64 ± 0.02 ops/s. best: 204.68ops/s.
total: 4808.41 ± 0.43ms. best: 4807.53ms.
avg: 4.89 ± 0.00ms
min: 0.39 ± 0.00ms
max: 13.68 ± 0.02ms
p99: 8.44 ± 0.01ms

1000 routes, 2 coordinates, 3 alternatives, overview=false, steps=false
ops: 176.21 ± 0.02 ops/s. best: 176.23ops/s.
total: 5675.17 ± 0.52ms. best: 5674.50ms.
avg: 5.68 ± 0.00ms
min: 0.06 ± 0.00ms
max: 12.81 ± 0.03ms
p99: 11.87 ± 0.01ms

Peak RAM: 73.984MB
random_table_ch 250 tables, 3 coordinates
ops: 1099.06 ± 4.31 ops/s. best: 1102.88ops/s.
total: 227.47 ± 0.90ms. best: 226.68ms.
avg: 0.91 ± 0.00ms
min: 0.61 ± 0.01ms
max: 1.29 ± 0.17ms
p99: 1.15 ± 0.01ms

250 tables, 25 coordinates
ops: 122.93 ± 0.04 ops/s. best: 122.97ops/s.
total: 2033.60 ± 0.60ms. best: 2032.98ms.
avg: 8.13 ± 0.00ms
min: 7.30 ± 0.01ms
max: 9.01 ± 0.01ms
p99: 8.79 ± 0.02ms

250 tables, 50 coordinates
ops: 59.73 ± 0.01 ops/s. best: 59.74ops/s.
total: 4185.51 ± 0.53ms. best: 4184.87ms.
avg: 16.74 ± 0.00ms
min: 15.55 ± 0.01ms
max: 17.74 ± 0.01ms
p99: 17.64 ± 0.01ms

Peak RAM: 63.000MB
250 tables, 3 coordinates
ops: 1103.16 ± 4.60 ops/s. best: 1106.71ops/s.
total: 226.63 ± 0.95ms. best: 225.89ms.
avg: 0.91 ± 0.00ms
min: 0.61 ± 0.00ms
max: 1.28 ± 0.17ms
p99: 1.15 ± 0.01ms

250 tables, 25 coordinates
ops: 123.65 ± 0.03 ops/s. best: 123.68ops/s.
total: 2021.85 ± 0.53ms. best: 2021.27ms.
avg: 8.09 ± 0.00ms
min: 7.26 ± 0.00ms
max: 8.96 ± 0.01ms
p99: 8.71 ± 0.01ms

250 tables, 50 coordinates
ops: 60.24 ± 0.00 ops/s. best: 60.24ops/s.
total: 4150.22 ± 0.22ms. best: 4149.90ms.
avg: 16.60 ± 0.00ms
min: 15.43 ± 0.02ms
max: 17.57 ± 0.01ms
p99: 17.46 ± 0.00ms

Peak RAM: 63.000MB
random_table_mld 250 tables, 3 coordinates
ops: 224.10 ± 0.36 ops/s. best: 224.55ops/s.
total: 1115.59 ± 2.03ms. best: 1113.34ms.
avg: 4.46 ± 0.01ms
min: 3.52 ± 0.01ms
max: 5.75 ± 0.03ms
p99: 5.56 ± 0.06ms

250 tables, 25 coordinates
ops: 23.59 ± 0.01 ops/s. best: 23.60ops/s.
total: 10597.94 ± 2.64ms. best: 10593.05ms.
avg: 42.39 ± 0.01ms
min: 39.10 ± 0.05ms
max: 46.31 ± 0.03ms
p99: 45.81 ± 0.09ms

250 tables, 50 coordinates
ops: 10.90 ± 0.00 ops/s. best: 10.90ops/s.
total: 22943.27 ± 7.06ms. best: 22929.13ms.
avg: 91.77 ± 0.03ms
min: 86.90 ± 0.13ms
max: 97.13 ± 0.22ms
p99: 95.63 ± 0.08ms

Peak RAM: 63.281MB
250 tables, 3 coordinates
ops: 228.61 ± 0.28 ops/s. best: 228.83ops/s.
total: 1093.56 ± 1.33ms. best: 1092.50ms.
avg: 4.37 ± 0.01ms
min: 3.48 ± 0.00ms
max: 5.61 ± 0.03ms
p99: 5.43 ± 0.06ms

250 tables, 25 coordinates
ops: 24.11 ± 0.01 ops/s. best: 24.12ops/s.
total: 10370.58 ± 2.91ms. best: 10365.08ms.
avg: 41.48 ± 0.01ms
min: 38.31 ± 0.05ms
max: 45.30 ± 0.04ms
p99: 44.70 ± 0.07ms

250 tables, 50 coordinates
ops: 11.12 ± 0.00 ops/s. best: 11.13ops/s.
total: 22479.86 ± 8.73ms. best: 22471.23ms.
avg: 89.92 ± 0.03ms
min: 85.00 ± 0.05ms
max: 95.76 ± 0.24ms
p99: 94.00 ± 0.20ms

Peak RAM: 63.281MB
random_trip_ch 250 trips, 3 coordinates
ops: 319.26 ± 0.43 ops/s. best: 319.58ops/s.
total: 783.06 ± 1.05ms. best: 782.27ms.
avg: 3.13 ± 0.00ms
min: 1.62 ± 0.01ms
max: 4.36 ± 0.01ms
p99: 4.23 ± 0.01ms

250 trips, 5 coordinates
ops: 207.43 ± 0.03 ops/s. best: 207.47ops/s.
total: 1205.22 ± 0.16ms. best: 1205.01ms.
avg: 4.82 ± 0.00ms
min: 3.18 ± 0.01ms
max: 6.21 ± 0.00ms
p99: 6.01 ± 0.01ms

Peak RAM: 74.000MB
250 trips, 3 coordinates
ops: 320.69 ± 0.49 ops/s. best: 321.10ops/s.
total: 779.58 ± 1.20ms. best: 778.57ms.
avg: 3.12 ± 0.00ms
min: 1.63 ± 0.01ms
max: 4.33 ± 0.00ms
p99: 4.21 ± 0.01ms

250 trips, 5 coordinates
ops: 206.85 ± 0.07 ops/s. best: 206.95ops/s.
total: 1208.62 ± 0.44ms. best: 1208.02ms.
avg: 4.83 ± 0.00ms
min: 3.18 ± 0.00ms
max: 6.27 ± 0.01ms
p99: 6.01 ± 0.01ms

Peak RAM: 74.000MB
random_trip_mld 250 trips, 3 coordinates
ops: 106.69 ± 0.10 ops/s. best: 106.85ops/s.
total: 2343.34 ± 2.12ms. best: 2339.77ms.
avg: 9.37 ± 0.01ms
min: 5.44 ± 0.01ms
max: 12.30 ± 0.12ms
p99: 12.06 ± 0.01ms

250 trips, 5 coordinates
ops: 68.86 ± 0.04 ops/s. best: 68.92ops/s.
total: 3630.80 ± 1.92ms. best: 3627.39ms.
avg: 14.52 ± 0.01ms
min: 10.13 ± 0.03ms
max: 18.05 ± 0.11ms
p99: 17.39 ± 0.01ms

Peak RAM: 69.500MB
250 trips, 3 coordinates
ops: 107.60 ± 0.06 ops/s. best: 107.66ops/s.
total: 2323.42 ± 1.25ms. best: 2322.02ms.
avg: 9.29 ± 0.00ms
min: 5.43 ± 0.01ms
max: 12.12 ± 0.01ms
p99: 11.97 ± 0.01ms

250 trips, 5 coordinates
ops: 69.25 ± 0.02 ops/s. best: 69.28ops/s.
total: 3610.09 ± 0.97ms. best: 3608.37ms.
avg: 14.44 ± 0.00ms
min: 10.06 ± 0.01ms
max: 17.82 ± 0.01ms
p99: 17.28 ± 0.02ms

Peak RAM: 69.500MB
route_ch 1000 routes, 3 coordinates, no alternatives, overview=full, steps=true
638.32ms
0.63832ms/req
1000 routes, 2 coordinates, 3 alternatives, overview=full, steps=true
781.296ms
0.781296ms/req
1000 routes, 3 coordinates, no alternatives, overview=false, steps=false
245.44ms
0.24544ms/req
1000 routes, 2 coordinates, 3 alternatives, overview=false, steps=false
216.193ms
0.216193ms/req
1000 routes, 3 coordinates, no alternatives, overview=full, steps=true
637.409ms
0.637409ms/req
1000 routes, 2 coordinates, 3 alternatives, overview=full, steps=true
779.065ms
0.779065ms/req
1000 routes, 3 coordinates, no alternatives, overview=false, steps=false
249.703ms
0.249703ms/req
1000 routes, 2 coordinates, 3 alternatives, overview=false, steps=false
218.527ms
0.218527ms/req
route_mld 1000 routes, 3 coordinates, no alternatives, overview=full, steps=true
807.942ms
0.807942ms/req
1000 routes, 2 coordinates, 3 alternatives, overview=full, steps=true
1039.07ms
1.03907ms/req
1000 routes, 3 coordinates, no alternatives, overview=false, steps=false
414.341ms
0.414341ms/req
1000 routes, 2 coordinates, 3 alternatives, overview=false, steps=false
461.84ms
0.46184ms/req
1000 routes, 3 coordinates, no alternatives, overview=full, steps=true
809.449ms
0.809449ms/req
1000 routes, 2 coordinates, 3 alternatives, overview=full, steps=true
1039.48ms
1.03948ms/req
1000 routes, 3 coordinates, no alternatives, overview=false, steps=false
417.137ms
0.417137ms/req
1000 routes, 2 coordinates, 3 alternatives, overview=false, steps=false
464.268ms
0.464268ms/req
rtree 1 result:
227.492ms -> 0.0227492 ms/query
10 results:
258.177ms -> 0.0258177 ms/query
1 result:
227.203ms -> 0.0227203 ms/query
10 results:
258.055ms -> 0.0258055 ms/query

@SiarheiFedartsou SiarheiFedartsou changed the title from "Use std::countl_zero instead of __builtin_clzll" to "Use std::countl_zero instead of __builtin_clz" Sep 1, 2024
@SiarheiFedartsou SiarheiFedartsou marked this pull request as ready for review September 1, 2024 13:01
@SiarheiFedartsou
Member Author

I'll take the liberty of merging it without review...

@SiarheiFedartsou SiarheiFedartsou merged commit f636dbf into master Sep 28, 2024
22 checks passed
@SiarheiFedartsou SiarheiFedartsou deleted the sf-use-std-countl-zero branch September 28, 2024 18:34