Blur Algorithms and Implementations Test in C++

Tested algorithms and implementations:

Stack Blur (Anti-Grain Geometry 2.5 by Maxim Shemanarev)
Recursive Blur (Anti-Grain Geometry 2.5 by Maxim Shemanarev)
My unoptimized implementation of Stack Blur
My optimized implementations of Stack Blur using SSE2, SSSE3, SSE4.1

Note: the AGG versions were slightly modified so that they can be used with multiple threads and to suppress some compiler warnings.

The Stack Blur algorithm was invented by Mario Klingemann.
https://medium.com/@quasimondo

AGG (Anti-Grain Geometry) is a library written by Maxim Shemanarev.

Do not use the SIMD versions with compiler optimizations disabled; they will be too slow.
Use at least the '-O1' optimization level (GCC and Clang).

All tested versions use a 32-bit pixel format (8 bits per component; the order of the components does not matter).
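To illustrate why the component order does not matter, here is a minimal sketch (not code from this repository; the name `average_pixels` is hypothetical). It treats a pixel as four independent 8-bit components and averages two pixels one component at a time, so the result is the same whether the layout is RGBA, BGRA or anything else:

```cpp
#include <cstdint>

// Hypothetical helper: averages two 32-bit pixels component by component.
// Each 8-bit component is processed the same way, so the function never
// needs to know which byte is red, green, blue or alpha.
inline std::uint32_t average_pixels(std::uint32_t a, std::uint32_t b) {
    std::uint32_t result = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        const std::uint32_t ca = (a >> shift) & 0xFFu;  // component of a
        const std::uint32_t cb = (b >> shift) & 0xFFu;  // component of b
        result |= ((ca + cb) / 2u) << shift;            // average stays within 8 bits
    }
    return result;
}
```

A blur is essentially a weighted version of the same per-component averaging over more than two pixels.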

Clang optimizes much better than GCC with the same flags. Both tested compilers are from the MSYS2/MinGW64 toolchain.

The fastest implementation I could write takes about 0.7 ms for a 1280x720 32bpp frame on an AMD Ryzen 7 2700 with SSE4.1 and 16 threads.

Parallel 'for' loop range distribution

Suppose we have a loop:

for ( int i = 0; i < 8; i++ ) ...

And we want to split it across 3 threads.
8 iterations / 3 threads gives 3 threads with 2 iterations each, plus 2 remaining iterations.

Old algorithm (remainder is added to the last thread):

Thread #1: [0;2) - size 2
Thread #2: [2;4) - size 2
Thread #3: [4;8) - size 4

This approach has two disadvantages:

Non-uniform range distribution
The last thread has the biggest block and often (though not always) begins executing after all previous threads have already started

New algorithm (remainder is distributed over first threads):

Thread #1: [0;3) - size 3
Thread #2: [3;6) - size 3
Thread #3: [6;8) - size 2
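
The two schemes can be sketched as follows. This is an illustration under assumptions, not the code used in this repository: the `Range` struct and both function names are hypothetical. Each function returns the half-open range `[begin; end)` for thread index `t` (counted from 0).

```cpp
// Hypothetical type: half-open range [begin; end) of loop indices.
struct Range {
    int begin;
    int end;
};

// Old scheme: every thread gets total / threads iterations,
// and the last thread also takes the whole remainder.
inline Range range_remainder_last(int total, int threads, int t) {
    const int base  = total / threads;
    const int begin = t * base;
    const int end   = (t == threads - 1) ? total : begin + base;
    return {begin, end};
}

// New scheme: the first (total % threads) threads get one extra iteration each.
inline Range range_remainder_first(int total, int threads, int t) {
    const int base  = total / threads;
    const int rem   = total % threads;
    // Threads before t have already consumed min(t, rem) extra iterations.
    const int extra_before = (t < rem) ? t : rem;
    const int begin = t * base + extra_before;
    const int end   = begin + base + (t < rem ? 1 : 0);
    return {begin, end};
}
```

With `total = 8` and `threads = 3`, `range_remainder_last` yields [0;2), [2;4), [4;8) and `range_remainder_first` yields [0;3), [3;6), [6;8), matching the tables above. In the new scheme the block sizes differ by at most one iteration.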

TODO:

Recursive Blur SIMD version
Gaussian Blur SIMD version

Example on YouTube (needs to be updated):

Repository: AntonSazonov/BigBlurTest
