Polymer Named Entity Normalization

IMPORTANT NOTE: The code and data shared here is available for academic non-commercial use only

This repo contains code and data for the paper 'Machine-Guided Polymer Knowledge Extraction from the Literature Using Natural Language Processing: The Example of Named Entity Normalization' [1].

Requirements and Setup

Python 3.6+
Pytorch (version 1.5.0)
scikit-learn (version 0.22.1)
spacy (version 2.1.6)

You can install all required Python packages using the provided env.yml file using conda env create -f env.yml

Running the code

The code for normalization has been adapted from [https://github.com/iesl/expLinkage] [2]. Some of the major changes include addition of the parameterized cosine distance metric and addition of a test mode for prediction of clusters for zero-shot data. The following commands can be used to replicate the experiments in the paper.

To train the supervised clustering model in our paper

python src/trainer/train_vect_data.py --config="src/utils/Config.py" --mode="train" --resultDir="/path/to/output_dir" --clusterFile="data/input_data/fastText/labeled_polymer_clusters.tsv"

To train the baseline model described in our paper

python src/baseline/baseline_train.py --labeled_file="ata/input_data/fastText/labeled_polymer_clusters_with_name.tsv" --use_labels=True --output_dir="path/to/output_dir"

References

[1] Shetty, Pranav, and Rampi Ramprasad. "Machine-Guided Polymer Knowledge Extraction Using Natural Language Processing: The Example of Named Entity Normalization." Journal of Chemical Information and Modeling (2021).

[2] Yadav, Nishant, et al. "Supervised hierarchical clustering with exponential linkage." International Conference on Machine Learning. PMLR, 2019

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
data		data
resources		resources
src		src
LICENSE		LICENSE
README.md		README.md
env.yml		env.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Polymer Named Entity Normalization

IMPORTANT NOTE: The code and data shared here is available for academic non-commercial use only

Requirements and Setup

Running the code

References

About

Releases

Packages

Contributors 3

Languages

License

Ramprasad-Group/polymerNEN

Folders and files

Latest commit

History

Repository files navigation

Polymer Named Entity Normalization

IMPORTANT NOTE: The code and data shared here is available for academic non-commercial use only

Requirements and Setup

Running the code

References

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages