
Deep Learning

Lesson 5

Introduction to Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are a specific neural network architecture that is highly effective at processing image data.


What are the different layers in a Convolutional Neural Network?

  • Input layer: The input layer holds the image data given as input to the Convolutional Neural Network. An image is represented as a 3D matrix (height × width × colour channels).

  • Convolution layers: A convolution layer is formed by applying multiple image filters (kernels) to the input image. At each position, the kernel and the image patch beneath it are combined with a dot product. In this way, a convolution layer transforms the input image data and extracts features from it.

  • Activation Functions: The activation function is the final component of the convolutional layer, and it introduces non-linearity into the output. In a convolution layer, the ReLU or tanh function is commonly employed as an activation function.


  • Image Kernels: Image kernels are small matrices used to apply effects such as blurring or sharpening to images. In machine learning, they are used for feature extraction to determine the important portions of an image. An image kernel is smaller than the input image.


  • Pooling layers: The pooling layer reduces the spatial size of the image representation to reduce the number of parameters and the amount of computation in the network. It operates on each feature map separately. Max pooling, average pooling, and sum pooling are some of the pooling types used in CNNs.


  • Fully Connected layer: A convolutional neural network's final layer is a fully connected layer. It takes the extracted features and recognizes and categorizes the objects in the image (see the sketch after this list).
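
To see how these layers fit together, here is a minimal sketch of a CNN, assuming PyTorch and an illustrative setup (32×32 RGB inputs, 10 output classes) that is not specified in the lesson:

```python
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Convolution layer: 16 learned 3x3 kernels over a 3-channel (RGB) input
        self.conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
        # Activation function: ReLU introduces non-linearity
        self.relu = nn.ReLU()
        # Pooling layer: 2x2 max pooling halves the spatial size
        self.pool = nn.MaxPool2d(kernel_size=2)
        # Fully connected layer: maps the flattened features to class scores
        self.fc = nn.Linear(16 * 16 * 16, num_classes)

    def forward(self, x):
        x = self.pool(self.relu(self.conv(x)))  # (N, 16, 16, 16) for a 32x32 input
        x = torch.flatten(x, start_dim=1)       # flatten the feature maps
        return self.fc(x)

# Input layer: a batch containing one 32x32 RGB image
image = torch.randn(1, 3, 32, 32)
logits = SimpleCNN()(image)
print(logits.shape)  # torch.Size([1, 10])
```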

What are strides in a CNN?

Stride is a filter parameter in a neural network that controls how far the filter moves across the image or video at each step. When the stride is set to 1, for example, the filter moves one pixel (or unit) at a time; with a stride of 2, it moves two pixels at a time.
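
As a rough illustration (assuming PyTorch; the 8×8 input and 3×3 filter are arbitrary choices, not from the lesson), comparing stride 1 with stride 2 shows how a larger stride shrinks the output:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 8, 8)  # one 8x8 single-channel image
conv_s1 = nn.Conv2d(1, 1, kernel_size=3, stride=1)
conv_s2 = nn.Conv2d(1, 1, kernel_size=3, stride=2)

print(conv_s1(x).shape)  # torch.Size([1, 1, 6, 6]) - filter moves one pixel at a time
print(conv_s2(x).shape)  # torch.Size([1, 1, 3, 3]) - filter skips every other position
```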


How to determine the size of a feature map?

If the input image has size n x n, the filter has size f x f, p is the padding amount, and s is the stride, then the dimension of the feature map is given by:

Dimension = floor[((n - f + 2p) / s) + 1] x floor[((n - f + 2p) / s) + 1]
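
For example, a 32 x 32 input with a 3 x 3 filter, padding 1, and stride 1 gives floor[((32 - 3 + 2)/1) + 1] = 32, i.e. a 32 x 32 feature map. A small helper function (hypothetical, just to evaluate the formula above) could look like this:

```python
import math

def feature_map_size(n, f, p, s):
    """Side length of the feature map for an n x n input, f x f filter, padding p, stride s."""
    return math.floor((n - f + 2 * p) / s) + 1

print(feature_map_size(32, 3, 1, 1))  # 32 -> feature map is 32 x 32
print(feature_map_size(8, 3, 0, 2))   # 3  -> feature map is 3 x 3
```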