CNN backward pass (#99)
* Concrete backward pass for the maxpool2d layer

* Declare and allocate internal gradients; backward pass in progress

* Tidy up dw calculation

* 3-d activation functions for the conv2d layer

* Backward pass for the conv2d layer, first implementation

* Consistent notation in comments

* Make maxpool2d backward pass pure

* conv2d % update() method and integrate with the layer type backward pass

* Begin work on CNN training test

* Set the layer_shape attribute of the reshape layer

* Add a TODO comment for the reshape layer error checking

* Add an example for training a CNN on MNIST data

* Reorganize examples

* Update the README

* Clean up README

* Bump version to 0.9.0

* Wrap the backward pass for maxpool2d in the high-level layer backward method

* Add test for maxpool2d backward pass
milancurcic authored Oct 13, 2022
1 parent e9af5b4 commit 9bbd70f
Showing 20 changed files with 644 additions and 176 deletions.
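The headline change in this commit is the backward pass for the `maxpool2d` and `conv2d` layers. As an illustration only (this is not the repository's code; the routine name and arguments below are hypothetical), the core idea of the max-pooling backward pass is to route each output gradient back to the input element that produced the maximum during the forward pass:

```fortran
! Sketch of gradient routing for a 2-d max-pooling layer (hypothetical).
! Assumes the forward pass recorded, for each output element, the row and
! column of the winning input element (here in maxloc_idx).
pure subroutine maxpool2d_backward_sketch(maxloc_idx, gradient, input_grad)
  integer, intent(in) :: maxloc_idx(:,:,:,:) ! (2, channels, out_height, out_width)
  real, intent(in) :: gradient(:,:,:)        ! dL/d(output), shape (channels, out_height, out_width)
  real, intent(out) :: input_grad(:,:,:)     ! dL/d(input), shape (channels, in_height, in_width)
  integer :: n, i, j
  ! All non-maximum input elements receive zero gradient.
  input_grad = 0
  do concurrent (n = 1:size(gradient, 1), i = 1:size(gradient, 2), j = 1:size(gradient, 3))
    ! Only the element that won the max in window (i, j) receives the gradient.
    input_grad(n, maxloc_idx(1, n, i, j), maxloc_idx(2, n, i, j)) = gradient(n, i, j)
  end do
end subroutine maxpool2d_backward_sketch
```

Recording the max locations during the forward pass is what allows the backward pass to be written as a pure routine, consistent with the "Make maxpool2d backward pass pure" item in the commit message.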
40 changes: 23 additions & 17 deletions README.md
@@ -16,23 +16,22 @@ Read the paper [here](https://arxiv.org/abs/1902.06714).

## Features

* Dense, fully connected neural layers
* Convolutional and max-pooling layers (experimental, forward propagation only)
* Flatten and reshape layers (forward and backward passes)
* Loading dense and convolutional models from Keras h5 files
* Training and inference of dense (fully connected) and convolutional neural
networks
* Loading dense and convolutional models from Keras HDF5 (.h5) files
* Stochastic and mini-batch gradient descent for back-propagation
* Data-based parallelism
* Several activation functions and their derivatives

### Available layer types
### Available layers

| Layer type | Constructor name | Supported input layers | Rank of output array | Forward pass | Backward pass |
|------------|------------------|------------------------|----------------------|--------------|---------------|
| Input (1-d and 3-d) | `input` | n/a | 1, 3 | n/a | n/a |
| Dense (fully-connected) | `dense` | `input1d` | 1 | ✅ | ✅ |
| Convolutional (2-d) | `conv2d` | `input3d`, `conv2d`, `maxpool2d` | 3 | ✅ | ❌ |
| Max-pooling (2-d) | `maxpool2d` | `input3d`, `conv2d`, `maxpool2d` | 3 | ✅ | ❌ |
| Flatten | `flatten` | `input3d`, `conv2d`, `maxpool2d` | 1 | ✅ | ✅ |
| Input | `input` | n/a | 1, 3 | n/a | n/a |
| Dense (fully-connected) | `dense` | `input1d`, `flatten` | 1 | ✅ | ✅ |
| Convolutional (2-d) | `conv2d` | `input3d`, `conv2d`, `maxpool2d`, `reshape` | 3 | ✅ | ✅ |
| Max-pooling (2-d) | `maxpool2d` | `input3d`, `conv2d`, `maxpool2d`, `reshape` | 3 | ✅ | ✅ |
| Flatten | `flatten` | `input3d`, `conv2d`, `maxpool2d`, `reshape` | 1 | ✅ | ✅ |
| Reshape (1-d to 3-d) | `reshape` | `input1d`, `dense`, `flatten` | 3 | ✅ | ✅ |

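As a quick illustration of how these constructors compose (a minimal sketch mirroring the `cnn_mnist` example added in this commit; the program name and layer sizes are arbitrary):

```fortran
program layer_demo
  ! Sketch: build a small CNN from the constructors listed in the table above.
  use nf, only: network, input, reshape, conv2d, maxpool2d, flatten, dense
  implicit none
  type(network) :: net
  net = network([ &
    input(784), &                                         ! 1-d input
    reshape([1, 28, 28]), &                               ! 1-d to 3-d
    conv2d(filters=8, kernel_size=3, activation='relu'), &
    maxpool2d(pool_size=2), &
    flatten(), &                                          ! 3-d back to 1-d
    dense(10, activation='softmax') &                     ! output layer
  ])
  call net % print_info()
end program layer_demo
```
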
## Getting started
@@ -201,10 +200,9 @@ examples, in increasing level of complexity:
1. [simple](example/simple.f90): Approximating a simple, constant data
relationship
2. [sine](example/sine.f90): Approximating a sine function
3. [mnist](example/mnist.f90): Hand-written digit recognition using the MNIST
dataset
4. [cnn](example/cnn.f90): Creating and running forward a simple CNN using
`input`, `conv2d`, `maxpool2d`, `flatten`, and `dense` layers.
3. [dense_mnist](example/dense_mnist.f90): Hand-written digit recognition
(MNIST dataset) using a dense (fully-connected) network
4. [cnn_mnist](example/cnn_mnist.f90): Training a CNN on the MNIST dataset
5. [dense_from_keras](example/dense_from_keras.f90): Creating a pre-trained
dense model from a Keras HDF5 file and running the inference.
6. [cnn_from_keras](example/cnn_from_keras.f90): Creating a pre-trained
@@ -247,10 +245,18 @@ Thanks to all open-source contributors to neural-fortran:
[@rouson](https://github.com/rouson),
and [@scivision](https://github.com/scivision).

Development of convolutional networks in neural-fortran was funded by a
contract from NASA Goddard Space Flight Center to the University of Miami.
Development of convolutional networks and Keras HDF5 adapters in
neural-fortran was funded by a contract from NASA Goddard Space Flight Center
to the University of Miami.

## Related projects

* [Fortran Keras Bridge (FKB)](https://github.com/scientific-computing/FKB)
by Jordan Ott provides a Python bridge between old (v0.1.0) neural-fortran
style save files and Keras's HDF5 models. As of v0.9.0, neural-fortran
implements the full feature set of FKB in pure Fortran, and in addition
supports training and inference of convolutional networks.
* [rte-rrtmgp-nn](https://github.com/peterukk/rte-rrtmgp-nn) by Peter Ukkonen
is an implementation based on old (v0.1.0) neural-fortran that optimizes the
memory layout and the forward and backward passes of dense layers for speed
and for running on GPUs.
4 changes: 2 additions & 2 deletions example/CMakeLists.txt
@@ -1,8 +1,8 @@
foreach(execid
cnn
cnn_mnist
cnn_from_keras
dense_mnist
dense_from_keras
mnist
simple
sine
)
32 changes: 0 additions & 32 deletions example/cnn.f90

This file was deleted.

71 changes: 71 additions & 0 deletions example/cnn_mnist.f90
@@ -0,0 +1,71 @@
program cnn_mnist

use nf, only: network, sgd, &
input, conv2d, maxpool2d, flatten, dense, reshape, &
load_mnist, label_digits

implicit none

type(network) :: net

real, allocatable :: training_images(:,:), training_labels(:)
real, allocatable :: validation_images(:,:), validation_labels(:)
real, allocatable :: testing_images(:,:), testing_labels(:)
real, allocatable :: input_reshaped(:,:,:,:)
real :: acc
logical :: ok
integer :: n
integer, parameter :: num_epochs = 10

call load_mnist(training_images, training_labels, &
validation_images, validation_labels, &
testing_images, testing_labels)

net = network([ &
input(784), &
reshape([1,28,28]), &
conv2d(filters=8, kernel_size=3, activation='relu'), &
maxpool2d(pool_size=2), &
conv2d(filters=16, kernel_size=3, activation='relu'), &
maxpool2d(pool_size=2), &
flatten(), &
dense(10, activation='softmax') &
])

call net % print_info()

epochs: do n = 1, num_epochs

call net % train( &
training_images, &
label_digits(training_labels), &
batch_size=128, &
epochs=1, &
optimizer=sgd(learning_rate=3.) &
)

if (this_image() == 1) &
print '(a,i2,a,f5.2,a)', 'Epoch ', n, ' done, Accuracy: ', accuracy( &
net, validation_images, label_digits(validation_labels)) * 100, ' %'

end do epochs

print '(a,f5.2,a)', 'Testing accuracy: ', &
accuracy(net, testing_images, label_digits(testing_labels)) * 100, '%'

contains

real function accuracy(net, x, y)
type(network), intent(in out) :: net
real, intent(in) :: x(:,:), y(:,:)
integer :: i, good
good = 0
do i = 1, size(x, dim=2)
if (all(maxloc(net % predict(x(:,i))) == maxloc(y(:,i)))) then
good = good + 1
end if
end do
accuracy = real(good) / size(x, dim=2)
end function accuracy

end program cnn_mnist
4 changes: 2 additions & 2 deletions example/mnist.f90 → example/dense_mnist.f90
@@ -1,4 +1,4 @@
program mnist
program dense_mnist

use nf, only: dense, input, network, sgd, label_digits, load_mnist

@@ -59,4 +59,4 @@ real function accuracy(net, x, y)
accuracy = real(good) / size(x, dim=2)
end function accuracy

end program mnist
end program dense_mnist
2 changes: 1 addition & 1 deletion fpm.toml
@@ -1,5 +1,5 @@
name = "neural-fortran"
version = "0.8.0"
version = "0.9.0"
license = "MIT"
author = "Milan Curcic"
maintainer = "[email protected]"
4 changes: 2 additions & 2 deletions src/nf/nf_activation.f90 → src/nf/nf_activation_1d.f90
@@ -1,4 +1,4 @@
module nf_activation
module nf_activation_1d

! A collection of activation functions and their derivatives.

@@ -168,4 +168,4 @@ pure function tanh_prime(x) result(res)
res = 1 - tanh(x)**2
end function tanh_prime

end module nf_activation
end module nf_activation_1d
171 changes: 171 additions & 0 deletions src/nf/nf_activation_3d.f90
@@ -0,0 +1,171 @@
module nf_activation_3d

! A collection of activation functions and their derivatives.

implicit none

private

public :: activation_function
public :: elu, elu_prime
public :: exponential
public :: gaussian, gaussian_prime
public :: relu, relu_prime
public :: sigmoid, sigmoid_prime
public :: softmax, softmax_prime
public :: softplus, softplus_prime
public :: step, step_prime
public :: tanhf, tanh_prime

interface
pure function activation_function(x) result(res)
real, intent(in) :: x(:,:,:)
real :: res(size(x,1),size(x,2),size(x,3))
end function activation_function
end interface

contains

pure function elu(x, alpha) result(res)
! Exponential Linear Unit (ELU) activation function.
real, intent(in) :: x(:,:,:)
real, intent(in) :: alpha
real :: res(size(x,1),size(x,2),size(x,3))
where (x >= 0)
res = x
elsewhere
res = alpha * (exp(x) - 1)
end where
end function elu

pure function elu_prime(x, alpha) result(res)
! First derivative of the Exponential Linear Unit (ELU)
! activation function.
real, intent(in) :: x(:,:,:)
real, intent(in) :: alpha
real :: res(size(x,1),size(x,2),size(x,3))
where (x >= 0)
res = 1
elsewhere
res = alpha * exp(x)
end where
end function elu_prime

pure function exponential(x) result(res)
! Exponential activation function.
real, intent(in) :: x(:,:,:)
real :: res(size(x,1),size(x,2),size(x,3))
res = exp(x)
end function exponential

pure function gaussian(x) result(res)
! Gaussian activation function.
real, intent(in) :: x(:,:,:)
real :: res(size(x,1),size(x,2),size(x,3))
res = exp(-x**2)
end function gaussian

pure function gaussian_prime(x) result(res)
! First derivative of the Gaussian activation function.
real, intent(in) :: x(:,:,:)
real :: res(size(x,1),size(x,2),size(x,3))
res = -2 * x * gaussian(x)
end function gaussian_prime

pure function relu(x) result(res)
!! Rectified Linear Unit (ReLU) activation function.
real, intent(in) :: x(:,:,:)
real :: res(size(x,1),size(x,2),size(x,3))
res = max(0., x)
end function relu

pure function relu_prime(x) result(res)
! First derivative of the Rectified Linear Unit (ReLU) activation function.
real, intent(in) :: x(:,:,:)
real :: res(size(x,1),size(x,2),size(x,3))
where (x > 0)
res = 1
elsewhere
res = 0
end where
end function relu_prime

pure function sigmoid(x) result(res)
! Sigmoid activation function.
real, intent(in) :: x(:,:,:)
real :: res(size(x,1),size(x,2),size(x,3))
res = 1 / (1 + exp(-x))
end function sigmoid

pure function sigmoid_prime(x) result(res)
! First derivative of the sigmoid activation function.
real, intent(in) :: x(:,:,:)
real :: res(size(x,1),size(x,2),size(x,3))
res = sigmoid(x) * (1 - sigmoid(x))
end function sigmoid_prime

pure function softmax(x) result(res)
!! Softmax activation function
real, intent(in) :: x(:,:,:)
real :: res(size(x,1),size(x,2),size(x,3))
res = exp(x - maxval(x))
res = res / sum(res)
end function softmax

pure function softmax_prime(x) result(res)
!! Derivative of the softmax activation function.
real, intent(in) :: x(:,:,:)
real :: res(size(x,1),size(x,2),size(x,3))
res = softmax(x) * (1 - softmax(x))
end function softmax_prime

pure function softplus(x) result(res)
! Softplus activation function.
real, intent(in) :: x(:,:,:)
real :: res(size(x,1),size(x,2),size(x,3))
res = log(exp(x) + 1)
end function softplus

pure function softplus_prime(x) result(res)
! First derivative of the softplus activation function.
real, intent(in) :: x(:,:,:)
real :: res(size(x,1),size(x,2),size(x,3))
res = exp(x) / (exp(x) + 1)
end function softplus_prime

pure function step(x) result(res)
! Step activation function.
real, intent(in) :: x(:,:,:)
real :: res(size(x,1),size(x,2),size(x,3))
where (x > 0)
res = 1
elsewhere
res = 0
end where
end function step

pure function step_prime(x) result(res)
! First derivative of the step activation function.
real, intent(in) :: x(:,:,:)
real :: res(size(x,1),size(x,2),size(x,3))
res = 0
end function step_prime

pure function tanhf(x) result(res)
! Tangent hyperbolic activation function.
! Same as the intrinsic tanh, but must be
! defined here so that we can use procedure
! pointer with it.
real, intent(in) :: x(:,:,:)
real :: res(size(x,1),size(x,2),size(x,3))
res = tanh(x)
end function tanhf

pure function tanh_prime(x) result(res)
! First derivative of the tanh activation function.
real, intent(in) :: x(:,:,:)
real :: res(size(x,1),size(x,2),size(x,3))
res = 1 - tanh(x)**2
end function tanh_prime

end module nf_activation_3d
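
As a usage illustration only (the short driver below is hypothetical and not part of this commit), the `activation_function` abstract interface lets a layer hold its activation and derivative as procedure pointers and apply them element-wise to 3-d arrays:

```fortran
program activation_3d_demo
  ! Hypothetical driver: exercise the 3-d activations through procedure
  ! pointers that conform to the activation_function interface above.
  use nf_activation_3d, only: activation_function, relu, relu_prime
  implicit none
  procedure(activation_function), pointer :: activation => null()
  procedure(activation_function), pointer :: activation_prime => null()
  real :: z(2, 3, 3), a(2, 3, 3), da(2, 3, 3)
  call random_number(z)
  z = z - 0.5                  ! mix of positive and negative pre-activations
  activation => relu
  activation_prime => relu_prime
  a = activation(z)            ! element-wise max(0., z)
  da = activation_prime(z)     ! 1 where z > 0, 0 elsewhere
  print *, 'Fraction of active units:', sum(da) / size(da)
end program activation_3d_demo
```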
