NLP
A word2vec CBOW and Skip-gram implementation in PyTorch
Must-read papers on Natural Language Processing (NLP)
Dense Passage Retrieval using TensorFlow Keras on TPU
A Python package to compute HONEST, a score to measure hurtful sentence completions in language models. Published at NAACL 2021.
Models for automatically transforming toxic text to neutral
Data and info for the paper "ParaDetox: Text Detoxification with Parallel Data"
Design Patterns for Fusion-Based Object Retrieval
Paper List for Style Transfer in Text
Datasets for Hate Speech Detection
Code and data for the paper "Methods for Detoxification of Texts for the Russian Language"
Code for CAET5
This repo collects articles on text attribute transfer
A list of resources about Text Style Transfer
Official code and data repository for our EMNLP 2020 long paper "Reformulating Unsupervised Style Transfer as Paraphrase Generation" (https://arxiv.org/abs/2010.05700).
Shopping Queries Dataset: A Large-Scale ESCI Benchmark for Improving Product Search
[ICLR'23] DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models
🦜🔗 Build context-aware reasoning applications
Python code for training models in the ACL paper "Beyond BLEU: Training Neural Machine Translation with Semantic Similarity".
A very simple framework for state-of-the-art Natural Language Processing (NLP)
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image …
ACL'2023: DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models
A modular RL library to fine-tune language models to human preferences