There is room for documentation of best practices, either in the Crystal book or a wiki somewhere. The particular one I have hit is regarding image processing:
Using the conventional memory allocation scheme for image processing results in poor performance, because it can cause the garbage collector to scan extremely large memory buffers that are guaranteed to contain no references for the GC to follow.
On Linux, image buffers should be mapped through the HugeTLB facility, which avoids evicting all of the entries in the translation lookaside buffer (TLB) while you process images. Evicting them results in poor performance. Buffers allocated this way should, of course, be freed promptly once they are no longer needed.
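A minimal sketch of such an allocation in C, using the system call a Crystal binding would have to wrap. The names `image_buffer`, `image_buffer_map`, and `image_buffer_unmap` are hypothetical, and the 2 MiB huge page size is an assumption (it is the x86-64 default; check `Hugepagesize` in `/proc/meminfo`). `mmap` with `MAP_HUGETLB` fails unless huge pages have been reserved, so the sketch falls back to an ordinary anonymous mapping.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes MAP_HUGETLB and MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumed huge page size (2 MiB); machine-specific. */
#define HUGE_PAGE_SIZE ((size_t)2 * 1024 * 1024)

/* Hypothetical buffer descriptor, not part of any existing library. */
typedef struct {
    void  *data;   /* start of the mapping, NULL on failure */
    size_t length; /* mapped length, rounded up to HUGE_PAGE_SIZE */
    int    huge;   /* 1 if the mapping is backed by HugeTLB pages */
} image_buffer;

/* Map an anonymous buffer of at least `bytes` bytes, preferring HugeTLB. */
image_buffer image_buffer_map(size_t bytes)
{
    assert(bytes > 0);
    image_buffer buf = { NULL, 0, 0 };
    /* MAP_HUGETLB mappings should be a multiple of the huge page size. */
    size_t length = (bytes + HUGE_PAGE_SIZE - 1) / HUGE_PAGE_SIZE * HUGE_PAGE_SIZE;
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    void *p = MAP_FAILED;

#ifdef MAP_HUGETLB
    p = mmap(NULL, length, PROT_READ | PROT_WRITE, flags | MAP_HUGETLB, -1, 0);
    if (p != MAP_FAILED)
        buf.huge = 1;
#endif
    /* Fall back to normal pages when no huge pages are reserved. */
    if (p == MAP_FAILED)
        p = mmap(NULL, length, PROT_READ | PROT_WRITE, flags, -1, 0);
    if (p == MAP_FAILED)
        return buf;

    buf.data = p;
    buf.length = length;
    return buf;
}

/* Release the mapping as soon as the image is no longer needed. */
int image_buffer_unmap(image_buffer *buf)
{
    if (buf->data == NULL)
        return 0;
    int rc = munmap(buf->data, buf->length);
    if (rc == 0) {
        buf->data = NULL;
        buf->length = 0;
        buf->huge = 0;
    }
    return rc;
}
```

A caller would check `buf.data` for `NULL`, process the pixels, and call `image_buffer_unmap(&buf)` when done. The `huge` field only reports whether the HugeTLB request succeeded; it does not measure any speedup.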
Depending on the width of your CPU's cache lines, and the number of cache lines that can share the same cache set (the "ways"), image processing can result in frequent cache evictions. For example, a 2-way cache will always evict during a three-operand operation if the start addresses of all three buffers hash to the same cache set, which will probably be true if they all start on a page boundary. The solution is to stagger the beginning of each image buffer by a small random integer times the size of your CPU's cache line. This is obviously machine-specific.
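The staggering could be sketched in C roughly as follows. Everything here is illustrative: the names `staggered_buffer`, `staggered_alloc`, `staggered_alloc_random`, `staggered_free`, and `cache_set_index` are hypothetical, and the cache geometry (64-byte lines, 64 sets, as in a 32 KiB 8-way L1 cache) is an assumption that has to be replaced with the values of the target machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed geometry; machine-specific. */
#define CACHE_LINE        64   /* bytes per cache line */
#define CACHE_SETS        64   /* 32 KiB / (8 ways * 64 B) */
#define PAGE_SIZE_BYTES   4096
#define MAX_STAGGER_LINES 64   /* stagger by 0..63 cache lines */

/* Hypothetical descriptor: `base` is what must be freed, `pixels` is
   the staggered start that the image code should use. */
typedef struct {
    void          *base;
    unsigned char *pixels;
} staggered_buffer;

/* Cache set an address maps to under the assumed geometry. */
size_t cache_set_index(const void *p)
{
    return ((uintptr_t)p / CACHE_LINE) % CACHE_SETS;
}

/* Allocate a page-aligned block and offset the usable start by
   `lines` cache lines. */
staggered_buffer staggered_alloc(size_t bytes, unsigned lines)
{
    assert(lines < MAX_STAGGER_LINES);
    staggered_buffer buf = { NULL, NULL };
    /* Reserve room for the largest possible offset. */
    size_t total = bytes + (size_t)MAX_STAGGER_LINES * CACHE_LINE;
    void *base = NULL;

    if (posix_memalign(&base, PAGE_SIZE_BYTES, total) != 0)
        return buf;

    buf.base = base;
    buf.pixels = (unsigned char *)base + (size_t)lines * CACHE_LINE;
    return buf;
}

/* Same, with a small random offset (seed rand() elsewhere if desired). */
staggered_buffer staggered_alloc_random(size_t bytes)
{
    return staggered_alloc(bytes, (unsigned)(rand() % MAX_STAGGER_LINES));
}

void staggered_free(staggered_buffer *buf)
{
    free(buf->base);
    buf->base = NULL;
    buf->pixels = NULL;
}
```

Under this assumed geometry, every page-aligned start maps to set 0, because 4096 is exactly 64 lines of 64 bytes. A buffer staggered by `k` lines starts in set `k` instead, so three operands with different offsets no longer compete for the same set at their first pixel.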
These best practices are related to performance, which already has a guide.
They can be useful additions, even though they require even deeper understanding of computer science.
Can these points be illustrated with a simple example?
Probably the best approach is to create a shard for image buffer allocation, and recommend its use. I have limited time to save the world at the moment - trying to get a startup off the ground before I go broke - but I'll keep this on my queue.