Relation extraction

In this project we researched and implemented several classifiers that predict the relationship between entities. The work was done as part of the Natural Language Processing and Information Extraction course.
The main documents are located in the docs folder.

Install and Quick Start

First, create a new conda environment with Python 3.10 and activate it:

conda create -n tuwnlpie python=3.10
conda activate tuwnlpie

Then install this repository as a package. The -e flag installs it in editable mode, so changes you make to the code take effect without reinstalling.

pip install -e .

All requirements should be specified in the setup.py file with the needed versions. The following tools are installed in additional steps:

Install the black library for code formatting:

pip install black

Install the pytest library for testing:

pip install pytest

Run Milestone 1

Training:

To train a model on the FoodDisease dataset and save the trained object to a file, run:

python ./scripts/train.py -t ./data/food_disease.csv -s -sp ./models/model_milestone1.pkl -m1
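
The flags in the command above can be read as a small argparse interface. The following is a minimal sketch, not the actual contents of scripts/train.py: the long option names (--train-data, --save, --save-path, --milestone) and the help texts are assumptions made for illustration, and only the short flags come from the command itself. It also shows that argparse accepts -m1 and -m 1 interchangeably.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Long option names are hypothetical; only the short flags appear in this README.
    parser = argparse.ArgumentParser(description="Train a relation extraction model")
    parser.add_argument("-t", "--train-data", required=True, help="path to the training CSV")
    parser.add_argument("-s", "--save", action="store_true", help="save the trained model")
    parser.add_argument("-sp", "--save-path", help="file to write the saved model to")
    parser.add_argument("-m", "--milestone", type=int, choices=[1, 2], help="milestone to run")
    return parser


# Parse the same arguments as the Milestone 1 training command.
args = build_parser().parse_args(
    ["-t", "./data/food_disease.csv", "-s", "-sp", "./models/model_milestone1.pkl", "-m1"]
)

# "-m1" and "-m 1" give the same result for a short option that takes a value.
same = build_parser().parse_args(["-t", "./data/food_disease.csv", "-m", "1"])

print(args.train_data, args.save, args.save_path, args.milestone)
```

Run python ./scripts/train.py -h to see the options the real script defines.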

Evaluation:

To evaluate a trained model on the dataset, run:

python scripts/evaluate.py -t data/food_disease.csv -sm data/bayes_model.pkl -sp -m 1
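
Conceptually, evaluation loads the saved object and compares its predictions with the gold labels. The sketch below illustrates that round trip under stated assumptions: it assumes the .pkl file is an ordinary Python pickle, and the toy majority-label "model", the helper names, and the example sentences are all made up for illustration and are not the project's classifier or data.

```python
import pickle
import tempfile
from collections import Counter
from pathlib import Path


def train(labels):
    # Toy stand-in for a trained model: remember the most frequent label.
    return {"majority_label": Counter(labels).most_common(1)[0][0]}


def predict(model, sentences):
    # Predict the stored majority label for every sentence.
    return [model["majority_label"] for _ in sentences]


def accuracy(predicted, gold):
    # Fraction of predictions that match the gold labels.
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Made-up examples, not rows from food_disease.csv.
sentences = ["a causes b", "c treats d", "e causes f", "g causes h"]
gold = ["cause", "treat", "cause", "cause"]

model = train(gold)

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.pkl"
    # Save the trained object, as the training step does with its model file.
    with model_path.open("wb") as f:
        pickle.dump(model, f)
    # Load it back, as an evaluation step would.
    with model_path.open("rb") as f:
        loaded = pickle.load(f)

score = accuracy(predict(loaded, sentences), gold)
print(f"accuracy: {score:.2f}")
```

Only unpickle model files from sources you trust, since loading a pickle can execute arbitrary code.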

Run Milestone 2

Training:

To train the neural network on the dataset in data/crowd_truth_combined.csv and save the weights to a file, run:

python ./scripts/train.py -t ./data/crowd_truth_combined.csv -sdp -s -sp ./models/model_milestone2.pt -m2

Evaluation:

To evaluate the model on the dataset with the trained weights, run the command below. You can also provide a pretrained model, so that others can evaluate it without training it first:

python scripts/evaluate.py -t data/imdb_dataset_sample.csv -sm data/bow_model.pt -sp -m 2

Running the tests

We use the pytest package for testing (installed with the command above). Run the tests with:

pytest
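
pytest collects functions whose names start with test_ from files named test_*.py or *_test.py. As a minimal sketch of what such a file could look like, assuming a hypothetical helper normalize_label that is not part of this repository:

```python
# Hypothetical file: tests/test_labels.py


def normalize_label(label: str) -> str:
    """Hypothetical helper: trim whitespace and lowercase a relation label."""
    return label.strip().lower()


def test_normalize_label_strips_and_lowercases():
    # Surrounding spaces are removed and the label is lowercased.
    assert normalize_label("  Cause ") == "cause"


def test_normalize_label_keeps_clean_input():
    # Already normalized labels are returned unchanged.
    assert normalize_label("treat") == "treat"
```

In a real test file the helper would be imported from the tuwnlpie package rather than defined next to the tests.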

Code formatting

To keep a consistent coding style across the project, we recommend formatting your code with the black package (installed with the command above). Format the code with:

black .
