H-TFIDF 🔍 📖

An NLP library for extracting discriminant terms across space and time

Usage

# Load the H-TFIDF package
from htfidf import htfidfBestTerms
# Load scikit-learn's bag-of-words vectorizer
from sklearn.feature_extraction.text import CountVectorizer

# `dataset` is assumed to be loaded beforehand (it is not defined in this snippet)
countVectorizer = CountVectorizer()

# Extract word occurrences from the dataset
wordCount = countVectorizer.fit_transform(dataset)

# Extract the top 100 H-TFIDF terms at the country and month level
htfidf_results = htfidfBestTerms(
    wordCount,
    spatial_information=dataset.geo,
    spatial_level="Country",
    temporal_level="month",
    top_n=100,
)
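To give an intuition for what "discriminant terms at a country and month level" means, here is a minimal, standard-library-only sketch. It is not the htfidf package's implementation, and the published H-TFIDF weighting may differ. It only illustrates the general idea of merging texts per (country, month) group and scoring terms with a plain TF-IDF across those groups. The function name `top_terms_by_group`, the `(country, month, text)` record layout, and the toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def top_terms_by_group(records, top_n=3):
    """Rank terms per (country, month) group with a plain TF-IDF.

    `records` is an iterable of (country, month, text) tuples.
    This layout is an assumption made for the illustration only.
    """
    # Merge all texts sharing a (country, month) key into one bag of words.
    group_counts = defaultdict(Counter)
    for country, month, text in records:
        group_counts[(country, month)].update(text.lower().split())

    # Group frequency: in how many groups does each term occur?
    group_freq = Counter()
    for counts in group_counts.values():
        group_freq.update(counts.keys())

    n_groups = len(group_counts)
    results = {}
    for key, counts in group_counts.items():
        total = sum(counts.values())
        scores = {
            term: (count / total) * math.log(n_groups / group_freq[term])
            for term, count in counts.items()
        }
        # Sort by descending score, then alphabetically for a stable order.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        results[key] = [term for term, _ in ranked[:top_n]]
    return results


# Toy records, invented for this example.
records = [
    ("France", "2020-03", "lockdown paris lockdown"),
    ("France", "2020-03", "virus paris"),
    ("Italy", "2020-03", "virus lombardy quarantine"),
    ("Italy", "2020-04", "virus recovery"),
]
top = top_terms_by_group(records, top_n=2)
for key in sorted(top):
    print(key, top[key])
```

In this toy data, "virus" appears in every group, so its score is zero everywhere, while terms specific to one place and period rise to the top. That is the sense in which a term is discriminant for a given space and time window.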

Installation 🚧

pip install htfidf

Related scientific publications

Conference	Paper type	Description
AGILE'2021	Full paper	H-TFIDF applied to COVID-19 tweets. For more information, visit the study's repository. The workflow is fully reproducible; see the related report.
IJID'2022	Poster	Using H-TFIDF feature for a spatial opinion mining on COVID-19 tweets
INSTICC'2022	Paper	Feature Selection for Sentiment Classification of COVID-19 Tweets: H-TFIDF Featuring BERT

Repository: remydecoupes/h-tfidf
