Fast Style Transfer in TensorFlow

Add styles from famous paintings to any photo in a fraction of a second! You can even style videos!

It takes 100ms on a 2015 Titan X to style the MIT Stata Center (1024×680) like Udnie, by Francis Picabia.

Our implementation is based on a combination of Gatys' A Neural Algorithm of Artistic Style, Johnson's Perceptual Losses for Real-Time Style Transfer and Super-Resolution, and Ulyanov's Instance Normalization.

Don't have access to a GPU but want to train a style or transform large batches of images?

Contact me at [email protected] for rates.

Video Stylization

Here we transformed every frame in a video, then combined the results. Click to go to the full demo on YouTube! The style here is Udnie, as above.

See how to generate these videos here!

Image Stylization

We added styles from various paintings to a photo of Chicago. Click on thumbnails to see full applied style images.



Implementation Details

Our implementation uses TensorFlow to train a fast style transfer network. We use roughly the same transformation network as described in Johnson, except that batch normalization is replaced with Ulyanov's instance normalization, and the scaling/offset of the output tanh layer is slightly different. We use a loss function close to the one described in Gatys, using VGG19 instead of VGG16 and typically using "shallower" layers than in Johnson's implementation (e.g. we use relu1_1 rather than relu1_2). Empirically, this results in larger scale style features in transformations.
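
To make the loss terms above concrete, here is a minimal NumPy sketch of the Gram-matrix style loss plus the content and total-variation terms, with default weights taken from the style.py flags documented below. It is only an illustration of the general Gatys/Johnson idea, not this repository's actual code: in the real network the feature maps come from VGG activations, and every function name here is made up for the example.

import numpy as np

def gram_matrix(features):
    # features: one feature map with shape (height, width, channels)
    h, w, c = features.shape
    f = features.reshape(h * w, c)
    return np.dot(f.T, f) / (h * w * c)

def style_loss(generated_feats, style_feats):
    # Sum of squared Gram-matrix differences over the chosen style layers
    loss = 0.0
    for gen, sty in zip(generated_feats, style_feats):
        g_gen, g_sty = gram_matrix(gen), gram_matrix(sty)
        loss += np.sum((g_gen - g_sty) ** 2) / g_sty.size
    return loss

def content_loss(generated_feat, content_feat):
    # Squared feature difference at a single content layer
    return np.sum((generated_feat - content_feat) ** 2) / content_feat.size

def tv_loss(image):
    # Total variation: penalizes differences between neighboring pixels
    dx = image[:, 1:, :] - image[:, :-1, :]
    dy = image[1:, :, :] - image[:-1, :, :]
    return (np.sum(dx ** 2) + np.sum(dy ** 2)) / image.size

def total_loss(generated_feats, style_feats, gen_content, target_content, image,
               content_weight=7.5e0, style_weight=1e2, tv_weight=2e2):
    # Weighted sum corresponding to --content-weight, --style-weight and --tv-weight
    return (content_weight * content_loss(gen_content, target_content)
            + style_weight * style_loss(generated_feats, style_feats)
            + tv_weight * tv_loss(image))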

Documentation

Training Style Transfer Networks

Use style.py to train a new style transfer network. Run python style.py to view all the possible parameters. Training takes 4-6 hours on a Maxwell Titan X. Before training, you should run setup.sh. More detailed documentation is in the style.py section below. Example usage:

python style.py --style path/to/style/img.jpg \
  --checkpoint-dir checkpoint/path \
  --test path/to/test/img.jpg \
  --test-dir path/to/test/dir \
  --content-weight 1.5e1 \
  --checkpoint-iterations 1000 \
  --batch-size 20

Evaluating Style Transfer Networks

Use evaluate.py to evaluate a style transfer network. Run python evaluate.py to view all the possible parameters. Evaluation takes 100 ms per frame on a Maxwell Titan X (batch size 1) and several seconds per frame on a CPU. Models for evaluation are located here. More detailed documentation is in the evaluate.py section below. Example usage:

python evaluate.py --checkpoint path/to/style/model.ckpt \
  --in-path dir/of/test/imgs/ \
  --out-path dir/for/results/

Stylizing Video

Use transform_video.py to transfer style into a video. Run python transform_video.py to view all the possible parameters. Requires ffmpeg. More detailed documentation is in the transform_video.py section below. Example usage:

python transform_video.py --in-path path/to/input/vid.mp4 \
  --checkpoint path/to/style/model.ckpt \
  --out-path out/video.mp4 \
  --device /gpu:0 \
  --batch-size 4

Requirements

You will need the following to run the above (one possible package setup is sketched after this list):

  • TensorFlow 0.11.0
  • Python 2.7.9, Pillow 3.4.2, scipy 0.18.1, numpy 1.11.2
  • If you want to train (and don't want to wait for 4 months):
    • A decent GPU
    • All the required NVIDIA software to run TF on a GPU (CUDA, cuDNN, etc.)
  • ffmpeg 3.1.3 if you want to stylize video
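
As a rough sketch only, the pinned Python packages above can be installed with pip as shown below. TensorFlow 0.11.0 was distributed as platform-specific wheels at the time, so follow the TensorFlow 0.11 install instructions for your platform (CPU-only, or the GPU build matching your CUDA setup) rather than assuming a plain pip install will fetch that version.

pip install Pillow==3.4.2 scipy==0.18.1 numpy==1.11.2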

Citation

  @misc{engstrom2016faststyletransfer,
    author = {Logan Engstrom},
    title = {Fast Style Transfer},
    year = {2016},
    howpublished = {\url{https://github.com/lengstrom/fast-style-transfer/}},
    note = {commit xxxxxxx}
  }

Attributions/Thanks

  • This project could not have happened without the advice (and GPU access) given by Anish Athalye.
    • The project also borrowed some code from Anish's Neural Style
  • Some readme/docs formatting was borrowed from Justin Johnson's Fast Neural Style
  • The image of the Stata Center at the very beginning of the README was taken by Juan Paulo

License

Copyright (c) 2016 Logan Engstrom. Contact me for commercial use (email: engstrom at my university's domain dot edu). Free for research/noncommercial use, as long as proper attribution is given and this copyright notice is retained.


style.py

style.py trains networks that can transfer styles from artwork into images. A run that exercises more of the flags below follows the list.

Flags

  • --checkpoint-dir: Directory to save checkpoint in. Required.
  • --style: Path to style image. Required.
  • --train-path: Path to training images folder. Default: data/train2014.
  • --test: Path to content image to test the network on at every checkpoint iteration. Default: no image.
  • --test-dir: Path to directory to save test images in. Required if --test is passed a value.
  • --epochs: Epochs to train for. Default: 2.
  • --batch-size: Batch size for training. Default: 4.
  • --checkpoint-iterations: Number of iterations between checkpoints. Default: 2000.
  • --vgg-path: Path to the VGG19 network used for the loss (a VGG16 network can be passed instead if you want to try other loss functions). Default: data/imagenet-vgg-verydeep-19.mat.
  • --content-weight: Weight of content in loss function. Default: 7.5e0.
  • --style-weight: Weight of style in loss function. Default: 1e2.
  • --tv-weight: Weight of total variation term in loss function. Default: 2e2.
  • --learning-rate: Learning rate for optimizer. Default: 1e-3.
  • --slow: For debugging the loss function. Optimizes directly on pixels using Gatys' approach, with the test image as the content target and --test-dir used to save the fully optimized images.
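
For illustration, a run that exercises more of the flags above might look like the following; the paths are placeholders and the numeric values are simply the documented defaults written out explicitly:

python style.py --style path/to/style/img.jpg \
  --checkpoint-dir checkpoint/path \
  --train-path data/train2014 \
  --vgg-path data/imagenet-vgg-verydeep-19.mat \
  --epochs 2 \
  --batch-size 4 \
  --content-weight 7.5e0 \
  --style-weight 1e2 \
  --tv-weight 2e2 \
  --learning-rate 1e-3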

evaluate.py

evaluate.py evaluates trained networks given a checkpoint directory. If evaluating images from a directory, every image in the directory must have the same dimensions unless --allow-different-dimensions is passed. An example invocation follows the flag list.

Flags

  • --checkpoint: Directory or ckpt file to load checkpoint from. Required.
  • --in-path: Path of image or directory of images to transform. Required.
  • --out-path: Path for the transformed output image, or a directory to put the transformed images in (if --in-path is a directory). Required.
  • --device: Device used to transform image. Default: /cpu:0.
  • --batch-size: Batch size used to evaluate images; mainly intended for directory transformations. Default: 4.
  • --allow-different-dimensions: Allow input images with different dimensions. Default: not enabled.
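
For example, a directory of mixed-size images could be evaluated on the GPU like this (paths are placeholders):

python evaluate.py --checkpoint path/to/style/model.ckpt \
  --in-path dir/of/test/imgs/ \
  --out-path dir/for/results/ \
  --device /gpu:0 \
  --batch-size 4 \
  --allow-different-dimensions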

transform_video.py

transform_video.py transforms videos into stylized videos given a style transfer net. See the example after the flag list.

Flags

  • --checkpoint: Directory or ckpt file to load checkpoint from. Required.
  • --in-path: Path to video to transfer style to. Required.
  • --out-path: Path to out video. Required.
  • --tmp-dir: Directory to put temporary processing files in. Default: a randomly generated hidden directory that is deleted after execution completes.
  • --device: Device to evaluate frames with. Default: /gpu:0.
  • --batch-size: Batch size for evaluating images. Default: 4.
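
As another example, a run that keeps the temporary files in a directory of your choosing and evaluates on the GPU might look like this (paths are placeholders):

python transform_video.py --in-path path/to/input/vid.mp4 \
  --checkpoint path/to/style/model.ckpt \
  --out-path out/video.mp4 \
  --tmp-dir path/to/tmp/dir \
  --device /gpu:0 \
  --batch-size 4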

About

Fast Style Transfer demo project from Udacity's Deep Learning Foundations Nanodegree.
