starreco stands for State-of-The-Art Review Recommendation System.
starreco is a PyTorch Lightning implementation of a series of state-of-the-art (SOTA) deep learning rating-based recommendation systems. This repository also serves as part of the literature review for the author's master's thesis.
- More than 20 recommendation models drawn from 20 publications.
- Built on top of PyTorch Lightning.
- GPU-accelerated execution.
- Reduced memory usage for large sparse matrices.
- Simple and understandable code.
- Easy extension and code reusability.
Research model | Description | Reference |
---|---|---|
MF | Matrix Factorization | [1] |
GMF | Generalized Matrix Factorization | [2] |
MLP | Multilayer Perceptrons | [2] |
NeuMF | Neural Matrix Factorization | [2] |
FM | Factorization Machine | [3] |
NeuFM | Neural Factorization Machine | [4] |
WDL | Wide & Deep Learning | [5] |
DeepFM | Deep Factorization Machine | [6] |
xDeepFM | Extreme Deep Factorization Machine | [7] |
FGCNN | Feature Generation by using Convolutional Neural Network | [8] |
ONCF | Outer Product-based Neural Collaborative Filtering | [9] |
CNNDCF | Convolutional Neural Network based Deep Collaborative Filtering | [10] |
ConvMF | Convolutional Matrix Factorization | [11] |
AutoRec | AutoRec | [12] |
DeepRec | DeepRec | [13] |
CFN | Collaborative Filtering Network | [14] |
CDAE | Collaborative Denoising AutoEncoder | [15] |
CCAE | Collaborative Convolutional AutoEncoder | [16] |
SDAECF | Stacked Denoising AutoEncoder for Collaborative Filtering | [17] |
mDACF | marginalized Denoising AutoEncoder Collaborative Filtering | [18] |
GMF++ | Generalized Matrix Factorization ++ | [19] |
MLP++ | Multilayer Perceptrons ++ | [19] |
NeuMF++ | Neural Matrix Factorization ++ | [20] |
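
As a rough illustration of the first row of the table, the matrix factorization idea in [1] predicts a rating as the dot product of a learned user vector and a learned item vector. The sketch below is a minimal NumPy version trained with plain SGD. It is not starreco's PyTorch Lightning `MF` module, and the names `train_mf` and `rmse`, the toy ratings, and the hyperparameters are all assumptions made for illustration.

```python
import numpy as np


def train_mf(ratings, num_users, num_items, dim=8, lr=0.02,
             weight_decay=0.02, epochs=200, seed=0):
    """Fit user and item factors to (user, item, rating) triples with SGD.

    Hypothetical helper for illustration only; not part of starreco.
    """
    rng = np.random.default_rng(seed)
    user_factors = rng.normal(scale=0.1, size=(num_users, dim))
    item_factors = rng.normal(scale=0.1, size=(num_items, dim))
    for _ in range(epochs):
        for u, i, r in ratings:
            # Copy both rows so each update uses the pre-update values.
            p_u = user_factors[u].copy()
            q_i = item_factors[i].copy()
            err = r - p_u @ q_i  # prediction is the dot product
            # Gradient step on squared error with L2 regularization.
            user_factors[u] += lr * (err * q_i - weight_decay * p_u)
            item_factors[i] += lr * (err * p_u - weight_decay * q_i)
    return user_factors, item_factors


def rmse(ratings, user_factors, item_factors):
    """Root mean squared error of the dot-product predictions."""
    errors = [r - user_factors[u] @ item_factors[i] for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errors))))


# Toy data (assumed): 3 users, 3 items, 6 observed ratings.
toy_ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
               (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]

untrained = train_mf(toy_ratings, 3, 3, epochs=0)  # random initial factors
trained = train_mf(toy_ratings, 3, 3)

rmse_before = rmse(toy_ratings, *untrained)
rmse_after = rmse(toy_ratings, *trained)
print(f"RMSE before training: {rmse_before:.3f}, after: {rmse_after:.3f}")
```

The error here is measured on the training triples only, so it shows that the factors fit the data, not that they generalize. Most of the other models in the table replace the dot product with a learned interaction function.
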
- MovieLens Dataset: A movie rating dataset collected from the MovieLens website by the GroupLens Research Project at the University of Minnesota. The datasets were collected over various time periods, depending on their size. The MovieLens 1M dataset has been chosen here. It contains 1,000,209 anonymous ratings of approximately 3,900 movies made by 6,040 MovieLens users who joined MovieLens in 2000.
- BookCrossing Dataset: The BookCrossing (BX) dataset was collected by Cai-Nicolas Ziegler in a 4-week crawl (August/September 2004) from the Book-Crossing community, with kind permission from Ron Hornbaker, CTO of Humankind Systems. It contains 278,858 users (anonymized, but with demographic information) providing 1,149,780 ratings (explicit and implicit) about 271,379 books.
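
The MovieLens 1M figures quoted above give a sense of why sparse storage matters for rating matrices. The following back-of-the-envelope sketch compares a dense user-by-item matrix with coordinate (COO) triples. The 32-bit field widths and the helper names `dense_bytes` and `coo_bytes` are assumptions for illustration; the storage format starreco actually uses is not described here.

```python
# Figures quoted for MovieLens 1M; the movie count is approximate.
NUM_USERS = 6_040
NUM_ITEMS = 3_900
NUM_RATINGS = 1_000_209


def dense_bytes(num_users, num_items, bytes_per_value=4):
    """Bytes for a dense user-by-item matrix of float32 ratings."""
    return num_users * num_items * bytes_per_value


def coo_bytes(num_ratings, bytes_per_index=4, bytes_per_value=4):
    """Bytes for COO storage: one (row, col, value) triple per rating."""
    return num_ratings * (2 * bytes_per_index + bytes_per_value)


density = NUM_RATINGS / (NUM_USERS * NUM_ITEMS)
dense = dense_bytes(NUM_USERS, NUM_ITEMS)
sparse = coo_bytes(NUM_RATINGS)

print(f"density: {density:.2%}")
print(f"dense:   {dense / 1e6:.1f} MB")
print(f"sparse:  {sparse / 1e6:.1f} MB")
```

Under these assumptions only about 4% of the cells hold a rating, so storing triples takes roughly 12 MB against roughly 94 MB for the dense matrix. The gap widens for sparser datasets such as BookCrossing.
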
Create virtual environment
python3 -m virtualenv env # Python 3.6 and above
Activate virtual environment
source env/bin/activate # Linux
./env/Scripts/activate # Windows
Clone and install necessary Python packages
git clone https://github.com/KyleOng/star-reco
cd star-reco
pip install -r requirements.txt
import os
import torch
from pytorch_lightning import Trainer
from pytorch_lightning.loggers import TensorBoardLogger
from pytorch_lightning.callbacks import ModelCheckpoint
from starreco.modules import *
from starreco.data import *
# data module
data_module = StarDataModule("ml-1m")
data_module.setup()
# module
module = MF([data_module.dataset.rating.num_users, data_module.dataset.rating.num_items],
            lr = 0.007629571188584098,
            weight_decay = 1.0643056040513936e-05)
# setup
# checkpoint callback
current_version = max(0, len(list(os.walk("checkpoints/mf")))-1)
checkpoint_callback = ModelCheckpoint(dirpath = f"checkpoints/mf/version_{current_version}",
                                      monitor = "val_loss",
                                      filename = "mf-{epoch:02d}-{train_loss:.4f}-{val_loss:.4f}")
# logger
logger = TensorBoardLogger("training_logs", name = "mf")
# trainer
trainer = Trainer(logger = logger,
                  gpus = -1 if torch.cuda.is_available() else None,
                  max_epochs = 100,
                  progress_bar_refresh_rate = 2,
                  callbacks = [checkpoint_callback])
trainer.fit(module, data_module)
# evaluate
module_test = MF.load_from_checkpoint(checkpoint_callback.best_model_path)
trainer.test(module_test, datamodule = data_module)
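
The `current_version` expression in the script above counts every directory that `os.walk` yields under `checkpoints/mf` (the root plus all nested directories) and subtracts one, so it matches the number of existing versions only while the `version_*` folders contain no subdirectories. A more explicit alternative is sketched below. The helper name `next_version` is hypothetical and not part of starreco, and the temporary directory exists only to make the example self-contained.

```python
import os
import tempfile


def next_version(root):
    """Return the next free index N for a `version_N` subdirectory of `root`.

    Hypothetical helper for illustration; not part of starreco.
    """
    if not os.path.isdir(root):
        return 0
    versions = [
        int(name.split("_", 1)[1])
        for name in os.listdir(root)
        if name.startswith("version_")
        and name.split("_", 1)[1].isdigit()
        and os.path.isdir(os.path.join(root, name))
    ]
    # With no existing versions, max(...) falls back to -1, giving 0.
    return max(versions, default=-1) + 1


with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "checkpoints", "mf")
    first = next_version(root)  # directory does not exist yet
    os.makedirs(os.path.join(root, "version_0"))
    os.makedirs(os.path.join(root, "version_1"))
    third = next_version(root)  # two versions already present

print(first, third)
```

Because it reads the highest existing index instead of counting directories, this variant also keeps working if an older version folder is deleted.
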
[1] Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix factorization techniques for recommender systems. Computer, 42(8), 30-37.
[2] He, X., Liao, L., Zhang, H., Nie, L., Hu, X., & Chua, T. S. (2017, April). Neural collaborative filtering. In Proceedings of the 26th international conference on world wide web (pp. 173-182).
[3] Rendle, S. (2010, December). Factorization machines. In 2010 IEEE International Conference on Data Mining (pp. 995-1000). IEEE.
[4] He, X., & Chua, T. S. (2017, August). Neural factorization machines for sparse predictive analytics. In Proceedings of the 40th International ACM SIGIR conference on Research and Development in Information Retrieval (pp. 355-364).
[5] Cheng, H. T., Koc, L., Harmsen, J., Shaked, T., Chandra, T., Aradhye, H., ... & Shah, H. (2016, September). Wide & deep learning for recommender systems. In Proceedings of the 1st workshop on deep learning for recommender systems (pp. 7-10).
[6] Guo, H., Tang, R., Ye, Y., Li, Z., & He, X. (2017). DeepFM: a factorization-machine based neural network for CTR prediction. arXiv preprint arXiv:1703.04247.
[7] Lian, J., Zhou, X., Zhang, F., Chen, Z., Xie, X., & Sun, G. (2018, July). xDeepFM: Combining explicit and implicit feature interactions for recommender systems. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 1754-1763).
[8] Liu, B., Tang, R., Chen, Y., Yu, J., Guo, H., & Zhang, Y. (2019, May). Feature generation by convolutional neural network for click-through rate prediction. In The World Wide Web Conference (pp. 1119-1129).
[9] He, X., Du, X., Wang, X., Tian, F., Tang, J., & Chua, T. S. (2018). Outer product-based neural collaborative filtering. arXiv preprint arXiv:1808.03912.
[10] Wu, Y., Wei, J., Yin, J., Liu, X., & Zhang, J. (2020). Deep Collaborative Filtering Based on Outer Product. IEEE Access, 8, 85567-85574.
[11] Kim, D., Park, C., Oh, J., Lee, S., & Yu, H. (2016, September). Convolutional matrix factorization for document context-aware recommendation. In Proceedings of the 10th ACM conference on recommender systems (pp. 233-240).
[12] Sedhain, S., Menon, A. K., Sanner, S., & Xie, L. (2015, May). AutoRec: Autoencoders meet collaborative filtering. In Proceedings of the 24th international conference on World Wide Web (pp. 111-112).
[13] Kuchaiev, O., & Ginsburg, B. (2017). Training deep autoencoders for collaborative filtering. arXiv preprint arXiv:1708.01715.
[14] Strub, F., Mary, J., & Gaudel, R. (2016). Hybrid collaborative filtering with autoencoders. arXiv preprint arXiv:1603.00806.
[15] Wu, Y., et al. (2016). Collaborative denoising auto-encoders for top-n recommender systems. In Proceedings of the Ninth ACM International Conference on Web Search and Data Mining. ACM.
[16] Zhang, S. Z., Li, P. H., & Chen, X. N. (2019, December). Collaborative Convolution AutoEncoder for Recommendation Systems. In Proceedings of the 2019 8th International Conference on Networks, Communication and Computing (pp. 202-207).
[17] Strub, F., & Mary, J. (2015, December). Collaborative filtering with stacked denoising autoencoders and sparse inputs. In NIPS workshop on machine learning for eCommerce.
[18] Li, S., Kawale, J., & Fu, Y. (2015, October). Deep collaborative filtering via marginalized denoising auto-encoder. In Proceedings of the 24th ACM international on conference on information and knowledge management (pp. 811-820).
[19] Liu, Y., Wang, S., Khan, M. S., & He, J. (2018). A novel deep hybrid recommender system based on auto-encoder with neural collaborative filtering. Big Data Mining and Analytics, 1(3), 211-221.
[20] To be published.