Skip to content

v0.3: 2nd year report for Russian Science Foundation

Pre-release
Pre-release
Compare
Choose a tag to compare
@mbuzdalov mbuzdalov released this 29 May 17:01
· 131 commits to param-control-production since this release

This release features two components.

A report on speed-ups in parallel non-dominated sorting.

On input sizes of 1000000 the divide-and-conquer algorithm, hybridized with ENS-SS, is able to achieve a speed-up of 4.44 with 8 threads.

The same algorithm hybridized with ENS-NDT, due to a higher threshold value, is not that good in parallelization: the maximum speed-up observed below is 2.721 with 8 threads.

The general trend is that the possible speed-ups increase with both the number of points and the dimension. Due to the divide-and-conquer algorithm structure, there is no parallelization available for dimensions 2 and 3 (for which the algorithm is already pretty fast).

First statistics on threshold tuning in hybrid algorithms.


Jensen-Fortin-Buzdalov hybridized with ENS-SS

Dataset: Uniform Hypercube

2 threads (best speed-up 1.889)

N\D 4 5 6 7 8 9 10
100000 1.409 1.634 1.696 1.707 1.736 1.703 1.719
316227 1.502 1.725 1.764 1.798 1.837 1.817 1.844
1000000 1.568 1.739 1.839 1.887 1.830 1.866 1.889

3 threads (best speed-up 3.020)

N\D 4 5 6 7 8 9 10
100000 1.504 1.953 2.125 2.285 2.258 1.905 2.265
316227 1.654 2.193 2.368 2.552 2.665 1.952 2.595
1000000 1.787 2.398 2.726 2.924 2.943 3.020 2.958

4 threads (best speed-up 3.351)

N\D 4 5 6 7 8 9 10
100000 1.702 2.228 2.477 2.566 2.607 2.563 2.572
316227 1.916 2.507 2.802 2.954 2.995 3.024 3.053
1000000 2.057 2.704 3.049 3.250 3.261 3.268 3.351

5 threads (best speed-up 3.501)

N\D 4 5 6 7 8 9 10
100000 1.701 2.242 2.522 2.550 2.662 2.631 2.642
316227 1.921 2.583 2.922 3.057 3.094 3.145 3.156
1000000 2.089 2.780 3.209 3.442 3.384 3.445 3.501

6 threads (best speed-up 3.994)

N\D 4 5 6 7 8 9 10
100000 1.745 2.299 2.663 2.779 2.844 2.834 2.846
316227 1.978 2.756 3.146 3.314 3.468 3.493 3.524
1000000 2.160 3.010 3.531 3.818 3.927 3.904 3.994

7 threads (best speed-up 4.283)

N\D 4 5 6 7 8 9 10
100000 1.760 2.368 2.733 2.815 2.892 2.848 2.881
316227 2.036 2.821 3.285 3.507 3.612 3.615 3.641
1000000 2.249 3.201 3.715 4.096 4.122 4.165 4.283

8 threads (best speed-up 4.446)

N\D 4 5 6 7 8 9 10
100000 1.798 2.451 2.797 2.889 2.995 2.927 2.971
316227 2.094 2.933 3.398 3.623 3.778 3.780 3.816
1000000 2.318 3.263 3.894 4.252 4.311 4.369 4.446

Dataset: Uniform Hyperplane

2 threads (best speed-up 1.871)

N\D 4 5 6 7 8 9 10
100000 1.281 1.508 1.617 1.602 1.678 1.677 1.721
316227 1.393 1.628 1.712 1.756 1.765 1.818 1.777
1000000 1.452 1.659 1.798 1.779 1.847 1.869 1.871

3 threads (best speed-up 3.302)

N\D 4 5 6 7 8 9 10
100000 1.350 1.793 2.109 2.035 2.231 2.254 2.405
316227 1.492 2.020 2.471 2.644 2.753 2.708 2.800
1000000 1.602 2.329 2.766 2.954 3.055 3.150 3.302

4 threads (best speed-up 3.594)

N\D 4 5 6 7 8 9 10
100000 1.432 1.856 2.177 2.149 2.230 2.319 2.479
316227 1.616 2.213 2.565 2.704 2.868 3.022 3.097
1000000 1.785 2.455 3.036 3.338 3.594 3.175 3.527

5 threads (best speed-up 3.737)

N\D 4 5 6 7 8 9 10
100000 1.454 1.961 2.196 2.310 2.374 2.489 2.624
316227 1.674 2.231 2.664 2.802 2.932 2.979 2.991
1000000 1.848 2.668 3.275 3.006 3.604 3.347 3.737

6 threads (best speed-up 3.878)

N\D 4 5 6 7 8 9 10
100000 1.453 2.007 2.285 2.386 2.572 2.673 2.732
316227 1.692 2.403 2.863 3.044 3.184 3.292 3.294
1000000 1.888 2.710 3.210 3.420 3.677 3.722 3.878

7 threads (best speed-up 4.098)

N\D 4 5 6 7 8 9 10
100000 1.490 2.011 2.408 2.473 2.639 2.709 2.731
316227 1.705 2.438 2.919 3.098 3.292 3.487 3.444
1000000 1.889 2.769 3.393 3.587 3.841 3.970 4.098

8 threads (best speed-up 4.272)

N\D 4 5 6 7 8 9 10
100000 1.498 2.097 2.419 2.531 2.704 2.720 2.869
316227 1.737 2.506 3.014 3.216 3.357 3.536 3.564
1000000 1.929 2.861 3.397 3.650 4.025 4.142 4.272

Jensen-Fortin-Buzdalov hybridized with ENS-NDT

Dataset: Uniform Hypercube

2 threads (best speed-up 1.713)

N\D 4 5 6 7 8 9 10
100000 1.353 1.400 1.352 1.312 1.330 1.363 1.372
316227 1.378 1.497 1.502 1.471 1.497 1.485 1.511
1000000 1.570 1.622 1.663 1.683 1.679 1.693 1.713

3 threads (best speed-up 2.114)

N\D 4 5 6 7 8 9 10
100000 1.440 1.536 1.410 1.359 1.342 1.383 1.386
316227 1.527 1.725 1.577 1.580 1.613 1.550 1.554
1000000 1.753 2.102 2.114 2.091 2.008 2.108 2.060

4 threads (best speed-up 2.440)

N\D 4 5 6 7 8 9 10
100000 1.565 1.670 1.538 1.479 1.468 1.526 1.531
316227 1.707 1.876 1.837 1.816 1.764 1.798 1.784
1000000 2.063 2.335 2.382 2.401 2.372 2.440 2.352

5 threads (best speed-up 2.479)

N\D 4 5 6 7 8 9 10
100000 1.596 1.660 1.553 1.491 1.470 1.514 1.538
316227 1.709 1.904 1.883 1.821 1.795 1.815 1.856
1000000 2.098 2.375 2.441 2.463 2.441 2.445 2.479

6 threads (best speed-up 2.671)

N\D 4 5 6 7 8 9 10
100000 1.630 1.693 1.533 1.492 1.508 1.517 1.532
316227 1.772 1.992 1.908 1.873 1.846 1.839 1.923
1000000 2.235 2.545 2.561 2.610 2.586 2.596 2.671

7 threads (best speed-up 2.677)

N\D 4 5 6 7 8 9 10
100000 1.675 1.708 1.553 1.511 1.505 1.512 1.521
316227 1.808 2.030 1.914 1.880 1.837 1.839 1.893
1000000 2.290 2.621 2.632 2.633 2.604 2.637 2.677

8 threads (best speed-up 2.721)

N\D 4 5 6 7 8 9 10
100000 1.687 1.731 1.557 1.492 1.484 1.505 1.491
316227 1.845 2.040 1.942 1.860 1.842 1.837 1.890
1000000 2.346 2.669 2.675 2.667 2.645 2.675 2.721

Dataset: Uniform Hyperplane

2 threads (best speed-up 1.623)

N\D 4 5 6 7 8 9 10
100000 1.260 1.312 1.288 1.292 1.312 1.296 1.280
316227 1.328 1.407 1.378 1.384 1.390 1.421 1.431
1000000 1.419 1.565 1.563 1.560 1.584 1.598 1.623

3 threads (best speed-up 2.005)

N\D 4 5 6 7 8 9 10
100000 1.282 1.428 1.336 1.306 1.319 1.315 1.327
316227 1.472 1.636 1.529 1.480 1.429 1.451 1.440
1000000 1.600 1.966 1.990 1.907 1.883 1.645 2.005

4 threads (best speed-up 2.165)

N\D 4 5 6 7 8 9 10
100000 1.469 1.488 1.426 1.397 1.375 1.419 1.457
316227 1.582 1.717 1.615 1.605 1.626 1.689 1.673
1000000 1.782 2.099 2.128 2.076 2.128 2.146 2.165

5 threads (best speed-up 2.266)

N\D 4 5 6 7 8 9 10
100000 1.503 1.496 1.405 1.401 1.426 1.432 1.455
316227 1.616 1.724 1.624 1.617 1.629 1.680 1.681
1000000 1.834 2.127 2.139 2.134 2.188 2.207 2.266

6 threads (best speed-up 2.389)

N\D 4 5 6 7 8 9 10
100000 1.509 1.499 1.406 1.402 1.426 1.421 1.443
316227 1.620 1.762 1.674 1.603 1.652 1.694 1.704
1000000 1.865 2.222 2.233 2.241 2.259 2.302 2.389

7 threads (best speed-up 2.390)

N\D 4 5 6 7 8 9 10
100000 1.523 1.464 1.405 1.411 1.422 1.432 1.468
316227 1.653 1.758 1.633 1.615 1.641 1.690 1.717
1000000 1.910 2.251 2.232 2.234 2.275 2.340 2.390

8 threads (best speed-up 2.408)

N\D 4 5 6 7 8 9 10
100000 1.537 1.484 1.407 1.415 1.414 1.403 1.460
316227 1.673 1.777 1.673 1.601 1.649 1.669 1.703
1000000 1.918 2.274 2.257 2.239 2.299 2.324 2.408