Migration to Pillow and huge performance improvements #137
Open: QSchulz wants to merge 11 commits into Psycojoker:master from QSchulz:multiprocess
QSchulz force-pushed the multiprocess branch 3 times, most recently from 75cae06 to 12194b5 (March 21, 2021 13:51)
This move makes sense if one wants to reuse remove_superficial_options, since it is not specific to cache.py. This prepares prosopopee for Pillow support. Signed-off-by: Quentin Schulz <[email protected]>
Dry runs (`prosopopee test`) shouldn't dump the cache since nothing's done except creating the HTML files which means the cache is more or less meaningless in that case. Let's dump the cache only when doing a normal build run. Signed-off-by: Quentin Schulz <[email protected]>
For images, calls to copy() are only needed when {{ image }} is used later in the template. Remove the other copy() calls, as they trigger creation of thumbnails that will never be used. Signed-off-by: Quentin Schulz <[email protected]>
Big gallery covers should be used for lines where only one gallery cover appears. With the current logic, if there is a prime number of galleries (except 2 and 3), the first one and all galleries whose index is prime (except the 2nd and 3rd) will have a big cover. In the end, all that matters is that if the galleries_line contains only one gallery, that gallery should have a big cover. Signed-off-by: Quentin Schulz <[email protected]>
Loggers work by hierarchy. A logger that has no explicit level of its own inherits the effective level of its parent. This applies to the loglevel, which is changed in prosopopee according to the --log-level argument. Since the root logger (obtained with logger = logging.getLogger()) is the parent of ALL loggers that could be declared in any third-party module, prosopopee's loglevel also applies to those modules, which is usually not wanted, especially when prosopopee's default loglevel is the most verbose one available. This is very annoying with Pillow since it is pretty verbose when saving files. Instead, let's declare a logger for prosopopee only. Unfortunately, since the package layout is unconventional (all *.py files in the same directory, instead of subdirs), the recommended logger = logging.getLogger(__name__) cannot be used: __name__ is __main__ in prosopopee.py, and in every other file it is the name of that file without its extension (e.g. cache in cache.py). These names are therefore unrelated in the eyes of the logging module, and prosopopee.py's loglevel would not apply to the other *.py files in the project. Instead, the value __name__ would have in a more conventional packaging layout is simulated by prepending prosopopee. to __name__, except in prosopopee.py, which holds the parent logger and is thus simply named prosopopee. Since prosopopee's logger is not the root logger anymore, the NOTSET loglevel cannot be used anymore because its meaning is basically "offload messages to parent logger" and the root logger has a default loglevel of WARNING, meaning prosopopee's default loglevel would not print anything labelled as INFO or DEBUG. cf. https://stackoverflow.com/a/50755200 Signed-off-by: Quentin Schulz <[email protected]>
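The naming scheme described in this commit message can be sketched as follows. This is a minimal illustration, not the code from the commit: the helper name get_logger is hypothetical, and only the logger names "prosopopee" and "prosopopee.<module>" come from the message above.

```python
import logging


def get_logger(module_name: str) -> logging.Logger:
    """Return the parent logger for the entry point, else a child of it.

    Hypothetical helper: it simulates the dotted names a conventional
    package layout would give to __name__.
    """
    if module_name in ("__main__", "prosopopee"):
        return logging.getLogger("prosopopee")
    return logging.getLogger("prosopopee." + module_name)


parent = get_logger("__main__")   # what prosopopee.py would request
child = get_logger("cache")       # what cache.py would request

# Setting the level on the parent is inherited by the children only;
# third-party loggers such as "PIL" hang off the root logger instead.
parent.setLevel(logging.DEBUG)
third_party = logging.getLogger("PIL")

print(child.name, logging.getLevelName(child.getEffectiveLevel()))
print(third_party.name, "parent is root:", third_party.parent is logging.getLogger())
```

Because the level is set explicitly on the prosopopee logger rather than left at NOTSET, its children no longer fall back to the root logger's level.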
In order to prepare for multiprocess support, migrate Cache.cache from a simple dict to a Manager().dict, which is one of the data types that can be modified safely from other processes. Signed-off-by: Quentin Schulz <[email protected]>
…uration json.dumps(), which is used to write the cache dict to a file, turns tuples into lists. With the current implementation, if a tuple is supposed to be cached, the needs_to_be_generated method will always return True even though it might not be correct. In order to support tuples in cache entries, let's pass the options given as parameter to the method through json.loads(json.dumps()) so that cached options and to-be-compared options have the same format. This will be used in a later commit which adds a tuple (width, height) to the cache. Signed-off-by: Quentin Schulz <[email protected]>
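The mismatch this commit works around is easy to reproduce. In the sketch below, the function name normalize and the option keys are made up for illustration; only the json.loads(json.dumps()) round trip comes from the commit message.

```python
import json


def normalize(options):
    """Round-trip through JSON so tuples become lists, as in the cache file."""
    return json.loads(json.dumps(options))


# What would be read back from the cache file on a later run.
cached = normalize({"size": (800, 600), "quality": 90})

# What the caller passes in on that later run.
requested = {"size": (800, 600), "quality": 90}

print(requested == cached)             # tuple vs list: the comparison fails
print(normalize(requested) == cached)  # same format on both sides
```

Normalizing only the incoming options is enough here, since the cached side has already been through JSON once.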
Currently, thumbnail generation is done in a single thread while parsing the galleries by calling graphicsmagick for every thumbnail to be generated. This is suboptimal even though graphicsmagick spreads its payload over all available CPU cores. After some quick and dirty benchmarking, it was found that multiprocessed Pillow was much more efficient than graphicsmagick for generating thumbnails. This patch adds support for generation of thumbnails with multiprocessed Pillow. Multiple processes have to be used rather than multiple threads because Python still uses the Global Interpreter Lock (GIL) for threads, meaning they cannot run Python code concurrently, which is what one wants for CPU-intensive tasks such as thumbnail generation. Multiprocessing brings its own set of challenges because most data structures, the cache for example, cannot be shared between processes. All data modified by any of the processes should be of a type handled by multiprocessing.Manager data structures. For best performance, all thumbnails for an image should be generated at once, so that the original image is opened only once. This requires keeping track of images and attaching the thumbnails to be created to the original image. This can be done via a factory which is passed to the Jinja templates so that they can request thumbnails for given images without knowing more than the original path, the name of the original image and the parameters of the thumbnails to create. The ImageFactory keeps all of those original images in a dictionary keyed by a virtual path made from the original image name and a CRC32 of all the options that apply to its thumbnails. This gives prosopopee the ability to group thumbnails per options (e.g. if options are passed in gallery settings.yaml). The original image (or BaseImage) is returned by the ImageFactory and the templates can then request .copy() or .thumbnail() for it.
The thumbnails are kept in a dictionary whose keys are the thumbnail names, each made out of the original name plus the thumbnail size and the CRC32 of the original image and the options that apply to it. This way, thumbnails are guaranteed to be unique even if requested multiple times by templates. The size is now read with imagesize.getsize() only once, when the ratio property or .copy() is called on the image, so that the performance impact is minimal. A notable change is that the resize option for images only accepts percentages for now. Another notable change is that the .copy() function actually also applies the quality setting, unlike the implementation with graphicsmagick. Since multiprocessing.Pool.map splits iterables into pre-defined chunks which are then assigned to processes, best performance requires processes with more or less the same taskload, so that one or more processes aren't idle while another is working at 100%. For that, the original images whose thumbnails are all cached should be removed from the list of images to process before the list is passed to multiprocessing.Pool.map. This has been tested against Pillow from 6.0.0 to 8.1.0 and Pillow-SIMD 7.0.0.post3. Signed-off-by: Quentin Schulz <[email protected]>
Signed-off-by: Quentin Schulz <[email protected]>
Signed-off-by: Quentin Schulz <[email protected]>
…ration Generating thumbnails is done in parallel worker processes via multiprocessing.Pool. By default, Pool starts as many worker processes as there are CPUs reported for the host machine. Let's allow users to select the number of processes Pool can use. Signed-off-by: Quentin Schulz <[email protected]>
A notable change is that the resize option for images only accepts
percentages for now.
Another notable change is that the .copy() function actually also
applies the quality setting, unlike the implementation with
graphicsmagick.
This has been tested against Pillow from 6.0.0 to 8.1.0 and Pillow-SIMD
7.0.0.post3.
Here are the different benchmarks. The setup is the following: ~1400 photos spread among 31 galleries. Building everything from scratch. Graphicsmagick means the current implementation in prosopopee. "Built Pillow 8.1.0" can be reproduced by installing libjpeg-turbo and then running `pip3 install --no-binary :all: --force-reinstall pillow`. Please follow https://pillow.readthedocs.io/en/stable/installation.html#building-from-source to make sure you have all the packages installed in your distribution prior to trying to compile Pillow.
Regenerating only one gallery of 71 photos:
Currently, thumbnail generation is done in a single thread while parsing
the galleries by calling graphicsmagick for every thumbnail to be
generated. This is suboptimal even though graphicsmagick spreads its
payload over all available CPU cores.
After some quick and dirty benchmarking, it was found that multiprocessed
Pillow was much more efficient than graphicsmagick for generating
thumbnails.
This PR adds support for generation of thumbnails with multiprocessed
Pillow.
Multiple processes have to be used rather than multiple threads because
Python still uses the Global Interpreter Lock (GIL) for threads, meaning
they cannot run Python code concurrently, which is what one wants for
CPU-intensive tasks such as thumbnail generation.
Multiprocessing brings its own set of challenges because most data
structures, the cache for example, cannot be shared between processes.
All data modified by any of the processes should be of a type handled by
multiprocessing.Manager data structures.
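As a rough illustration of that constraint, the sketch below lets worker processes write into a Manager-backed dict. The function record and the cache entries are hypothetical and unrelated to the actual Cache class in this PR; only Manager().dict() and Pool come from the standard library.

```python
import multiprocessing


def record(cache, key, options):
    """Store options under key; works on a plain dict or a Manager dict proxy."""
    cache[key] = options


if __name__ == "__main__":
    with multiprocessing.Manager() as manager:
        # The proxy forwards every write to the manager process, so
        # changes made in the workers are visible to the parent.
        cache = manager.dict()
        jobs = [
            (cache, "a.jpg", {"quality": 90}),
            (cache, "b.jpg", {"quality": 80}),
        ]
        with multiprocessing.Pool(processes=2) as pool:
            pool.starmap(record, jobs)
        print(dict(cache))
```

With a plain dict in place of manager.dict(), each worker would modify its own copy and the parent's dict would stay empty.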
For best performance, all thumbnails for an image should be generated at
once, so that the original image is opened only once. This requires
keeping track of images and attaching the thumbnails to be created to the
original image. This can be done via a factory which is passed to the
Jinja templates so that they can request thumbnails for given images
without knowing more than the original path, the name of the original
image and the parameters of the thumbnails to create.
The ImageFactory keeps all of those original images in a dictionary
keyed by a virtual path made from the original image name and a CRC32 of
all the options that apply to its thumbnails. This gives prosopopee the
ability to group thumbnails per options (e.g. if options are passed in
gallery settings.yaml).
The original image (or BaseImage) is returned by the ImageFactory and
the templates can then request .copy() or .thumbnail() for it.
The thumbnails are kept in a dictionary whose keys are the name of the
thumbnail which is made out of the original name plus its size and the
crc32 of the original image and the options that apply to it. This way,
thumbnails are guaranteed to be unique even if requested multiple times
by templates.
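A minimal sketch of that bookkeeping, under stated assumptions: the class names ImageFactory and BaseImage and the .thumbnail() method come from the description above, but the get() method, the key format, the thumbnail naming pattern and the exact input to the CRC32 are guesses made for illustration and may differ from the code in this PR.

```python
import json
import os
import zlib


def crc32_of(*parts) -> str:
    """CRC32 (hex) of a JSON dump of the given values; an assumed scheme."""
    payload = json.dumps(parts, sort_keys=True).encode("utf-8")
    return format(zlib.crc32(payload), "08x")


class BaseImage:
    def __init__(self, path, options):
        self.path = path
        self.options = options
        self.thumbnails = {}  # thumbnail name -> requested (width, height)

    def thumbnail(self, size):
        """Register a thumbnail request and return its (unique) name."""
        width, height = size
        stem, ext = os.path.splitext(self.path)
        name = f"{stem}-{width}x{height}-{crc32_of(self.path, self.options)}{ext}"
        self.thumbnails.setdefault(name, size)
        return name


class ImageFactory:
    def __init__(self):
        self.images = {}  # virtual path -> BaseImage

    def get(self, path, options):
        """Return the BaseImage for this path and option set, creating it once."""
        key = f"{path}#{crc32_of(options)}"
        if key not in self.images:
            self.images[key] = BaseImage(path, options)
        return self.images[key]


factory = ImageFactory()
image = factory.get("pic.jpg", {"quality": 90})
print(image.thumbnail((800, 600)))
print(image.thumbnail((800, 600)) in image.thumbnails)
```

Requesting the same size twice yields the same name, so the thumbnail is registered, and later generated, only once per original image and option set.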
The size is now read with imagesize.getsize() only once, when the ratio
property or .copy() is called on the image, so that the performance impact
is minimal.
Since multiprocessing.Pool.map splits iterables into pre-defined chunks
which are then assigned to processes, best performance requires
processes with more or less the same taskload, so that one or more
processes aren't idle while another is working at 100%. For that, the
original images whose thumbnails are all cached should be removed from
the list of images to process before the list is passed to
multiprocessing.Pool.map.
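The filtering step can be sketched as below. Everything here is illustrative: images are plain dicts, the cache is a set of thumbnail names, and generate is a placeholder for the real work (opening the original once with Pillow and writing every requested thumbnail). Only the idea of dropping fully cached images before calling Pool.map comes from the description.

```python
import multiprocessing


def needs_generation(image, cache):
    """True if at least one requested thumbnail is missing from the cache."""
    return any(name not in cache for name in image["thumbnails"])


def generate(image):
    """Placeholder worker: pretend to produce every thumbnail of one image."""
    return list(image["thumbnails"])


def build(images, cache, processes=None):
    # Drop fully cached images first, so the chunks Pool.map hands to each
    # worker contain only images that actually require work.
    pending = [img for img in images if needs_generation(img, cache)]
    if not pending:
        return []
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(generate, pending)


if __name__ == "__main__":
    images = [
        {"path": "a.jpg", "thumbnails": ["a-800x600.jpg", "a-400x300.jpg"]},
        {"path": "b.jpg", "thumbnails": ["b-800x600.jpg"]},
    ]
    cache = {"a-800x600.jpg", "a-400x300.jpg"}  # a.jpg is fully cached
    print(build(images, cache, processes=2))
```

The processes argument is passed straight to Pool, where None means "use the number of CPUs", which also covers the user-selectable process count added in the last commit.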
Thanks,
Quentin