Use a deep neural network to borrow the skills of real artists and turn your two-bit doodles into masterpieces! This project is an implementation of Semantic Style Transfer (Champandard, 2016), based on the Neural Patches algorithm (Li, 2016).
The `doodle.py` script generates an image by using three or four images as inputs: the original style and its annotation, and a target content image (optional) with its annotation (a.k.a. your doodle). The algorithm then extracts annotated patches from the style image, and incrementally transfers them over to the target image based on how closely they match.
NOTE: This project is possible thanks to the nucl.ai Conference on July 18-20. Join us in Vienna!
The algorithm is built for style transfer, but it can also generate image analogies that we call a #NeuralDoodle; use the hashtag if you post your images! Example files are included in the `#/samples/` folder. Execute with these commands:
```bash
# Synthesize a coastline as if painted by Monet. This uses "*_sem.png" masks for both images.
python3 doodle.py --style samples/Monet.jpg --output samples/Coastline.png \
                  --device=cpu --iterations=40

# Generate a scene around a lake in the style of a Renoir painting.
python3 doodle.py --style samples/Renoir.jpg --output samples/Landscape.png \
                  --device=gpu0 --iterations=80
```
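If you also have a real photo to use as the optional content image, the same script covers the full four-input case. The command below is only a sketch: it assumes a `--content` flag and a hypothetical `samples/Coastline.jpg` photo with matching `*_sem.png` annotations sitting next to it.

```bash
# Hedged example: paint a real coastline photo in Monet's style, using four inputs
# (style image + its mask, content photo + your doodle as its mask).
python3 doodle.py --style samples/Monet.jpg --content samples/Coastline.jpg \
                  --output samples/CoastlineAsMonet.png --device=gpu0 --iterations=80
```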
Note the `--device` argument that lets you specify which GPU or CPU to use. For the samples above, here are the performance results:
- GPU Rendering — Assuming you have CUDA and enough on-board RAM, the process should complete in less than 10 minutes, even with twice the iterations.
- CPU Rendering — This will take hours and hours, even up to 12 hours on older hardware. To match the quality of the GPU renders it'd take twice as long. Do multiple runs in parallel (see the sketch after this list)!
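Since each render is an independent process, one simple way to do multiple runs in parallel on the CPU is to background the jobs from your shell. A minimal sketch using the sample files above:

```bash
# Launch two independent CPU renders side by side, then wait for both to finish.
python3 doodle.py --style samples/Monet.jpg --output samples/Coastline.png \
                  --device=cpu --iterations=40 &
python3 doodle.py --style samples/Renoir.jpg --output samples/Landscape.png \
                  --device=cpu --iterations=40 &
wait
```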
The default is to use the `cpu`; if you have an NVIDIA card set up with CUDA already, try `gpu0`. On the CPU, you can also set the environment variable `OMP_NUM_THREADS=4`, but we've found the speed improvements to be minimal.
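For reference, pinning the thread count for a single CPU render looks like this (same Monet sample as above):

```bash
# Set OMP_NUM_THREADS just for this run; in our experience the speed-up is small.
OMP_NUM_THREADS=4 python3 doodle.py --style samples/Monet.jpg --output samples/Coastline.png \
                                    --device=cpu --iterations=40
```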
This project requires Python 3.4+, and you'll also need `numpy` and `scipy` (numerical computing libraries) as well as `python3-dev` installed system-wide. After fetching the repository, you can run the following commands from your terminal to set up a local environment:
```bash
# Create a local environment for Python 3.x to install dependencies here.
python3 -m venv pyvenv --system-site-packages

# If you're using bash, make this the active version of Python.
source pyvenv/bin/activate

# Set up the required dependencies simply using the PIP module.
python3 -m pip install --ignore-installed -r requirements.txt
```
After this, you should have `scikit-image`, `theano` and `lasagne` installed in your virtual environment. You'll also need to download this pre-trained neural network (VGG19, 80Mb) for the script to run. Once you're done, you can just delete the `#/pyvenv/` folder.
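As a quick sanity check before running `doodle.py`, the imports below should all succeed inside the virtual environment (note that scikit-image is imported as `skimage`):

```bash
# Verify the numerical and neural network dependencies are visible to Python 3.
python3 -c "import numpy, scipy, skimage, theano, lasagne; print('All dependencies found.')"
```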
You'll need a good NVIDIA card with CUDA to run this software on a GPU, ideally with 2GB to 4GB of on-board RAM, or better still 8GB to 12GB for larger resolutions. The code does work on the CPU by default, so use that as a fallback since you likely have more system RAM!
To reduce memory consumption, you can also install NVIDIA's `cudnn` library, version 3.0 or 4.0. This allows convolutional neural networks to run faster and saves space in GPU RAM.
FIX: Use `--device=cpu` to use main system memory.
This happens when you're running without a GPU and the CPU libraries were not found (e.g. `libblas`). The neural network expressions cannot be evaluated by Theano, and it's raising an exception.
FIX: `sudo apt-get install libblas-dev libopenblas-dev`
You need to install Lasagne and Theano directly from the versions specified in `requirements.txt`, rather than from the PIP releases; those alternatives are older and don't have the required features.
FIX: `python3 -m pip install -r requirements.txt`
It seems your terminal is misconfigured and not compatible with the way Python treats locales. You may need to change this in your `.bashrc` or other startup script. Alternatively, this command will fix it once for this shell instance.
FIX: `export LC_ALL=en_US.UTF-8`
It's possible there's a platform bug in the underlying libraries or compiler, which has been reported on MacOS El Capitan. It's not clear how to fix it, but you can try to disable optimizations to prevent the bug. (See Issue #8.)
FIX: Use the `--safe-mode` flag to disable optimizations.
It's still too early to say definitively; both approaches were discovered independently in 2016, by @alexjc and @awentzonline respectively. Here are some early impressions:
- One algorithm is style transfer that happens to do analogies, and the other is analogies that now happens to do style transfer too. Adam extended his implementation to use a content loss after the Semantic Style Transfer paper was published, so the two are even more similar under the hood!
- Both use a patch-based approach (Li, 2016), but semantic style transfer imposes a "prior" via the patch-selection process, while neural analogies adds an extra prior on the convolution activations. The outputs of the two algorithms are a little different, and it's not yet clear where each one is best.
- Semantic style transfer is simpler: it has fewer loss components, which means somewhat less code to write and fewer parameters involved (not necessarily positive or negative). Neural analogies is a little more complex, with as many parameters as the combination of two algorithms.
- Neural analogies is designed to work with images and only supports the RGB format for its masks. Semantic style transfer was designed to integrate with other neural networks (for pixel labeling and semantic segmentation), and can use any format for its maps, including RGBA or masks with many channels (one per label).
- Semantic style transfer is about 25% faster and uses less memory too. For neural analogies, the extra computation is effectively the analogy prior — which could improve the quality of the results in theory. In practice, it's hard to tell at this stage and more testing is needed.
If you have any comparisons or insights, be sure to let us know!