Add custom loss functions and an R/W state matrix #935

Open
voidvoxel opened this issue Jun 17, 2024 · 0 comments
Summary

Feature branch

Wow. I'm mentally and physically drained after implementing this, but it was so worth it!

I added support for defining custom loss functions, as well as an R/W state matrix to serve as RAM, allowing training by ruleset in addition to (or even in place of) a pre-existing set of training data. In my own benchmarks, a custom loss function designed to train for XOR improved training speed on the XOR data set by approximately 100% for NeuralNetwork (CPU) and 400% for NeuralNetworkGPU (GPU). This is, of course, a very specific example, but it's just that: an example. It paves the way for any custom loss function to be defined.

On GPU, you are limited to values you pass through the NeuralNetwork.ram property, which is the R/W state matrix mentioned above. On CPU, however, you're not limited by much of anything, since the function is called from the CPU rather than the GPU; you could hypothetically even involve API calls in the loss calculation, so long as the added latency doesn't become too major a bottleneck in training times.
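For illustration, here's a minimal sketch of how that GPU constraint plays out. Everything here assumes the feature branch's API as described above; `gpuLoss` is a hypothetical name, and the `NeuralNetworkGPU` usage is assumed to mirror the CPU examples below:

// Hypothetical GPU-side loss function (a sketch, not the final API).
const { NeuralNetworkGPU } = require('brain.js');

// On GPU the loss function is compiled into a kernel, so it can only
// use its numeric arguments; closures, host objects, and API calls are
// off-limits. The `ram` matrix is the sole channel for external state.
function gpuLoss(actual, expected, inputs, ram) {
  // Example: scale the error by a value fed in through the R/W matrix.
  return (expected - actual) * ram[0][0][0];
}

// Assumed usage, mirroring the feed-forward example below:
// new NeuralNetworkGPU().train(trainingData, { loss: gpuLoss });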

Basic examples

Feed-forward

// Import the feed-forward neural network class.
const { NeuralNetwork } = require('brain.js');

// Create a new neural network.
const net = new NeuralNetwork();

// Fabricate some training data.
const trainingData = [
  [0, 0, 0],
  [0, 1, 1],
  [1, 0, 1],
  [1, 1, 0]
].map(v => ({ input: v.slice(0, 2), output: v.slice(2) }));

// A custom loss function designed to train for XOR calculations.
// Because the XOR rule is encoded in the loss function itself, you
// could even train on random input data (`Math.random()` as input,
// for example) and the network would still learn the correct rule,
// with little (if any) difference in training times.
function loss(actual, expected, inputs, ram) {
  // Calculate the base loss. So far this is just a normal loss function.
  let loss = expected - actual;
  // Penalize incorrect behavior: amplify the error whenever the rounded
  // prediction disagrees with the true XOR of the inputs.
  if (Math.round(actual) !== (Math.round(inputs[0]) ^ Math.round(inputs[1]))) loss *= 20;
  // Return the calculated loss.
  return loss;
}

// Define the training options.
const trainOptions = {
  errorThresh: 0.011,
  iterations: 15000,
  loss
};

// Train the neural network using the custom loss function.
net.train(trainingData, trainOptions);

// Calculate a ^ b
function xor(a, b) {
  return Math.round(net.run([a, b])[0]);
}

// Try it out!
console.log(xor(0, 0));
console.log(xor(0, 1));
console.log(xor(1, 0));
console.log(xor(1, 1));
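Assuming training reaches the error threshold, the four calls above should log the XOR truth table:

// Expected console output once training has converged:
// 0
// 1
// 1
// 0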

The `updateRAM` function is easy to implement as well. Here's a silly example that just adds random noise to the R/W state matrix:

// Define an `updateRAM` function.
function updateRAM(
  ram,
  inputs,
  outputs,
  sizes,
  loss,
  deltaLoss
) {
  // GPU.js thread coordinates identify the RAM cell being updated.
  const layer = this.thread.z;
  const neuron = this.thread.y;
  const signal = this.thread.x;

  // Keep the current value and add a random value between 0 and 1.
  return ram[layer][neuron][signal] + Math.random();
}

// Set the `updateRAM` function of your neural network.
const { NeuralNetwork } = require('brain.js');
const net = new NeuralNetwork();
net.updateRAM = updateRAM;

This will cause each value in `NeuralNetwork.ram` to increase by a random value between 0 and 1 each time the neural network is fed input data.
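As a quick sanity check, here's a hedged host-side sketch of that behavior. It assumes `net.ram` is readable from the CPU and that the network has already been trained (so its layers, and therefore `ram`, are initialized); the `[layer][neuron][signal]` indexing follows the example above:

// Hypothetical host-side check (assumes `net.ram` is initialized).
const before = net.ram[0][0][0];
net.run([0, 1]);
const after = net.ram[0][0][0];

// `Math.random()` returns a value in [0, 1), so the cell never shrinks.
console.log(after >= before); // true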

Motivation

Why are we doing this? What use cases does it support? What is the expected outcome?
Given a method of describing not just what the neural network should learn, but also how it should learn it, we have the tools we need to build far more advanced models. It also adds new foundational pieces, allowing higher-level neural net types to build upon the custom loss function feature as well as the (very intentionally) general-purpose R/W state matrix.

Some foreseeable use cases:

  • Complex training algorithms with a state matrix that's shared between the updateRAM function (read-write; called after each run) and the loss function (read-only; called during training iterations); see the sketch after this list.
  • Reduced training times through loss function engineering.
  • A foundation for GPU-accelerated genetic algorithms.
  • A better autoencoder: I've already rewritten a prototype of AE that relies solely on a custom loss function to train, and it's certainly several steps above the current implementation. I'd like to divide AE into Autoencoder and AutoencoderGPU to properly support environments that lack a GPU, and to inherit directly from NeuralNetwork and NeuralNetworkGPU, respectively.
  • An easy way to implement a multi-armed bandit solution.
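To make the shared-state use case concrete (see the first bullet above), here's a minimal hedged sketch that combines the two hooks. It assumes the feature branch's `loss` and `updateRAM` signatures shown earlier; the run-counter scheme is purely illustrative:

// Hypothetical shared-state example (a sketch, not the final API).
// `updateRAM` (read-write) keeps a run counter in cell [0][0][0];
// `loss` (read-only) reads that counter to relax its penalty over time.
function updateRAM(ram, inputs, outputs, sizes, loss, deltaLoss) {
  const layer = this.thread.z;
  const neuron = this.thread.y;
  const signal = this.thread.x;

  // Use a single cell as a run counter; leave every other cell as-is.
  if (layer === 0 && neuron === 0 && signal === 0) {
    return ram[layer][neuron][signal] + 1;
  }
  return ram[layer][neuron][signal];
}

function loss(actual, expected, inputs, ram) {
  let loss = expected - actual;
  // Read the run counter and shrink the extra XOR penalty as training ages.
  const age = ram[0][0][0];
  if (Math.round(actual) !== (Math.round(inputs[0]) ^ Math.round(inputs[1]))) {
    loss *= 1 + 19 / (1 + age);
  }
  return loss;
}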