"How machines learn" for Java developers

AKA implementation from scratch of automatic differentiation and gradient descent in Java to understand how it works

This repo is the code I've written to understand how Tensorflow/PyTorch models can "learn" using automatic differentiation and gradient descent.

I'm considering writing an article on Medium.com or some other platform like mokabyte.it to talk about it.

While the main branch focuses on SGD with scalar values, I'm exploring how to change the code to support N Dimensional arrays in the tensor branch.