NLP
A well-documented, unit-tested, type-checked, and formatted implementation of a vanilla transformer, for educational purposes.
A wrapper around tensor2tensor to flexibly train, interact with, and generate data for neural chatbots.
RWKV (pronounced RwaKuv) is an RNN with great LLM performance that can also be trained directly like a GPT transformer (parallelizable). The current version is RWKV-7 "Goose". It combines the best of the RNN and the transformer: great performance, linear-time inference, and constant memory use (no KV cache).
An Open-Source Package for Neural Relation Extraction (NRE)
Pretrained language model with 100B parameters
NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations
A very simple framework for state-of-the-art Natural Language Processing (NLP)
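A minimal sketch of how Flair's tagging API typically looks (the "ner" tagger name and the example sentence are illustrative):

```python
# Sketch: load a pretrained Flair NER tagger and tag one sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

# "ner" is Flair's standard English named-entity tagger (downloaded on first use).
tagger = SequenceTagger.load("ner")

sentence = Sentence("George Washington went to Washington.")
tagger.predict(sentence)

# Print each detected entity span with its label and confidence.
for entity in sentence.get_spans("ner"):
    print(entity)
```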
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/
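A minimal sketch of running a TextAttack attack recipe against a classifier (the model and dataset names are illustrative examples from the Hugging Face hub):

```python
# Sketch: attack a sequence classifier with the TextFooler recipe.
import transformers
from textattack import Attacker, AttackArgs
from textattack.attack_recipes import TextFoolerJin2019
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import HuggingFaceModelWrapper

# Any HF sequence-classification model works; this fine-tuned IMDB model is an example.
model = transformers.AutoModelForSequenceClassification.from_pretrained(
    "textattack/bert-base-uncased-imdb"
)
tokenizer = transformers.AutoTokenizer.from_pretrained("textattack/bert-base-uncased-imdb")
model_wrapper = HuggingFaceModelWrapper(model, tokenizer)

attack = TextFoolerJin2019.build(model_wrapper)
dataset = HuggingFaceDataset("imdb", split="test")

# Attack a handful of examples and log the perturbed results.
Attacker(attack, dataset, AttackArgs(num_examples=5)).attack_dataset()
```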
Multi-Task Deep Neural Networks for Natural Language Understanding
DeFactoNLP: An Automated Fact-checking System that uses Named Entity Recognition, TF-IDF vector comparison and Decomposable Attention models.
An easy-to-use framework for large-scale fact-checking and question answering
Detect hallucinated tokens in conditional sequence generation.
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
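A minimal sketch of BertViz's head view (intended for a Jupyter notebook; the model name is just an example):

```python
# Sketch: visualize BERT attention heads for one sentence (in a Jupyter notebook).
from bertviz import head_view
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# output_attentions=True makes the model return per-layer attention weights.
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tokenizer("The cat sat on the mat", return_tensors="pt")
outputs = model(**inputs)

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
head_view(outputs.attentions, tokens)  # renders an interactive attention view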
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data.
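A minimal sketch in the Haystack 2.x component/pipeline style (the model name and prompt are illustrative, and an OPENAI_API_KEY environment variable is assumed):

```python
# Sketch: a two-component Haystack pipeline (prompt builder -> LLM generator).
from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator

template = "Answer briefly.\nQuestion: {{ question }}\nAnswer:"

pipe = Pipeline()
pipe.add_component("prompt_builder", PromptBuilder(template=template))
pipe.add_component("llm", OpenAIGenerator(model="gpt-4o-mini"))  # needs OPENAI_API_KEY
pipe.connect("prompt_builder.prompt", "llm.prompt")

result = pipe.run({"prompt_builder": {"question": "What is a vector DB?"}})
print(result["llm"]["replies"][0])
```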
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
This repository contains our path generation framework Co-NNECT, in which we combine two models for establishing knowledge relations and paths between concepts from sentences, as a form of explicitation of implicit knowledge in texts.
State-of-the-Art Text Embeddings
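A minimal sketch of typical sentence-transformers usage (the model name is one common example):

```python
# Sketch: embed sentences and compare them with cosine similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # a small, widely used embedding model

sentences = ["A man is eating food.", "Someone is having a meal."]
embeddings = model.encode(sentences, convert_to_tensor=True)

# Cosine similarity between the two sentence embeddings.
print(util.cos_sim(embeddings[0], embeddings[1]))
```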
CoCo-Ex extracts meaningful concepts from natural language texts and maps them to conjunct concept nodes in ConceptNet, making maximal use of the relational information stored in the ConceptNet knowledge graph.
Library for Knowledge Intensive Language Tasks
🪐 End-to-end NLP workflows from prototype to production
Mass-editing thousands of facts into a transformer memory (ICLR 2023)