Features:
- Implemented a planning algorithm to globally optimize the trade-off between cuDNN workspace consumption and speed.
- The `richness` parameter now specifies the total memory, in MB, available to cuDNN for convolution workspaces.
- The framework now tries to find the best combination of convolution algorithms under this memory limit.
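The global planning described above can be sketched as a small search problem: pick one algorithm per convolution layer to minimize total time while keeping the summed workspace under the budget. This is an illustrative sketch, not the framework's actual code; the `plan` function, its inputs, and the exhaustive search strategy are all assumptions made for clarity.

```python
# Illustrative sketch (hypothetical, not the framework's implementation):
# choose one algorithm per convolution layer to minimize total time,
# subject to a global workspace budget, via exhaustive search.
from itertools import product

def plan(layers, budget_mb):
    """layers: list of per-layer candidate lists of (time_ms, workspace_mb).
    Returns (total_time, choice) for the fastest combination that fits."""
    best = None
    for choice in product(*[range(len(c)) for c in layers]):
        time = sum(layers[i][j][0] for i, j in enumerate(choice))
        mem = sum(layers[i][j][1] for i, j in enumerate(choice))
        if mem <= budget_mb and (best is None or time < best[0]):
            best = (time, choice)
    return best

# Two layers, each with a fast-but-hungry and a slow-but-lean algorithm.
layers = [[(1.0, 300), (3.0, 10)],
          [(2.0, 250), (5.0, 20)]]
print(plan(layers, 512))  # -> (5.0, (1, 0)): fast algorithms together exceed 512 MB
```

A real planner would use measured timings from cuDNN's algorithm enumeration rather than hard-coded numbers, and a smarter search than brute force for deep networks.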
Features:
- cuDNN v4 support
- 20% overall speed gain with faster convolution and batch normalization
- The native batch normalization implementation has been changed to comply with cuDNN. Use the script `python/bn_var_to_inv_std.py` to upgrade your models.
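The script name suggests the upgrade converts stored variance blobs to cuDNN's inverse-standard-deviation convention. A minimal sketch of that conversion, assuming the standard formula `inv_std = 1 / sqrt(var + eps)` (the function name and epsilon default here are illustrative, not taken from the script itself):

```python
import numpy as np

def var_to_inv_std(var, eps=1e-5):
    # Convert a stored variance blob to an inverse standard deviation,
    # the convention cuDNN's batch normalization expects.
    return 1.0 / np.sqrt(var + eps)

var = np.array([0.25, 1.0, 4.0])
print(var_to_inv_std(var, eps=0.0))  # -> [2.  1.  0.5]
```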
Features:
- A Python layer can now expose a `prefetch()` method, which will be run in parallel with network processing.
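The idea can be sketched with a toy layer whose `prefetch()` loads the next batch on a background thread while `forward()` consumes the current one. This is a hypothetical illustration of the pattern; the class name, queue depth, and threading arrangement are assumptions, not the framework's actual layer API.

```python
# Hypothetical sketch of a prefetching Python layer (not the framework's API).
import threading
import queue

class PrefetchingDataLayer:
    def __init__(self, source):
        self.source = source           # iterator yielding batches
        self.q = queue.Queue(maxsize=2)

    def prefetch(self):
        # The framework would call this in parallel with network processing.
        self.q.put(next(self.source))

    def forward(self):
        return self.q.get()            # blocks until prefetch() has delivered

batches = iter([[1, 2], [3, 4]])
layer = PrefetchingDataLayer(batches)
t = threading.Thread(target=layer.prefetch)  # stand-in for the framework's worker
t.start()
print(layer.forward())  # -> [1, 2]
t.join()
```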
Features:
- Improved cuDNN wrapper to use less GPU memory.
- There is a new parameter, `richness`, which controls the workspace memory limit for cuDNN.
Features:
- Support for cuDNN v3.
Features:
- A new mechanism for parallel communication reduces parallelization overhead.
- Batch normalization, courtesy of @Cysu.
Features:
- Action recognition tools, scripts, and examples.
- Basic parallel training support
- Various extra data augmentations