Papers to code from scratch A Neural Probabilistic Language Model Efficient Estimation of Word Representations in Vector Space GloVe: Global Vectors for Word Representation Scaling Laws for Neural Language Models Learn PyTorch for Deep Learning: Zero to Mastery book Pytorch 0 to mastery