Unofficial implementation of the paper Bag of Tricks for Efficient Text Classification by Joulin et al.
FastText requires Python 3 with Keras installed.
Obtain the Yelp Dataset from here and
place yelp_academic_dataset_review.json
in the base directory.
Train the model using the following command:
./train.py
It generates data.csv
which represents the model's embedding space of the
validation set. It is obtained by removing the last layer of the model and using
t-SNE for the dimensionality reduction.
index.html
implements a D3 visualisation to view the embedding space. You need
to run a local web server because browsers don't allow file accesses:
python -m http.server 8000
Now point your browser to: localhost:8000.
FastText is licensed under the terms of the Apache v2.0 license.
- Ihor Kroosh
- Tim Nieradzik