AleSteB/CatalysisIE_Knowledge_Graph_Generator

CatalysisIE based Knowledge Graph Generator

Repository for the publication "Generating knowledge graphs through text mining of catalysis research related literature". The two Excel files listing the query outputs described in the publication are in the output folder.

The tool consists of the following modules: preprocess_onto.py, txt_extract.py, text_mining.py, and onto_extension.py. There is also a Jupyter notebook with example SPARQL queries and functions for querying the ontology for the information of interest.

Preparations

Before running the code, complete the following preparations:

  • Folder structure must be the following:
main_folder
├── import
├── ontologies
├── ontology_snipet
├── CatalysisIE
├── PDFDataExtractor
├── robot
├── output
└── classlist
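
The folder layout above can be prepared with a short script. This is a minimal sketch, not part of the repository: the function name create_layout is hypothetical, and the script only creates empty directories. It does not install or download the tools that CatalysisIE, PDFDataExtractor, and robot presumably hold.

```python
from pathlib import Path

# Folder names copied from the tree above.
FOLDERS = [
    "import", "ontologies", "ontology_snipet", "CatalysisIE",
    "PDFDataExtractor", "robot", "output", "classlist",
]


def create_layout(main_folder):
    """Create any missing folders under main_folder and return their paths.

    Hypothetical helper: it only makes empty directories and leaves
    existing ones (and their contents) untouched.
    """
    root = Path(main_folder)
    created = []
    for name in FOLDERS:
        path = root / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    for path in create_layout("main_folder"):
        print(path)
```

Because mkdir is called with exist_ok=True, the script can be rerun safely on a partially prepared folder.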

CatalysisIE Checkpoint

The checkpoint of the extended CatalysisIE model is found here: DOI

Usage

  1. Execute create_ChEBIdict.py to create a dictionary of all ChEBI classes for later entity recognition (this might take some time)
  2. Place the PDFs in the import folder
  3. Make sure a model for
  4. Insert your Scopus API key in config.json and adjust other settings where necessary
  5. Execute run_pdfs.py (this uses the modules txt_extract.py, text_mining.py, preprocess_onto.py, and onto_extension.py and stores the resulting knowledge graph in ontologies)
  6. Execute the Jupyter notebook user_queries.ipynb for predefined queries on the resulting knowledge graph
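
Before step 5, a quick check of steps 2 and 4 can catch a missing input early. This is a minimal sketch under assumptions, not code from the repository: the function preflight is hypothetical, config.json is assumed to sit in the main folder, and the field name scopus_api_key is a guess that should be replaced with the key actually used in your config.json.

```python
import json
from pathlib import Path


def preflight(main_folder, key_name="scopus_api_key"):
    """Return a list of problems found before running run_pdfs.py.

    key_name is a hypothetical field name; adjust it to match config.json.
    """
    root = Path(main_folder)
    problems = []

    # Step 2: at least one PDF should be in the import folder.
    import_dir = root / "import"
    pdfs = sorted(import_dir.glob("*.pdf")) if import_dir.is_dir() else []
    if not pdfs:
        problems.append("no PDFs found in 'import'")

    # Step 4: config.json should exist and contain a non-empty API key.
    config_path = root / "config.json"  # assumed location
    if not config_path.is_file():
        problems.append("config.json is missing")
    else:
        config = json.loads(config_path.read_text(encoding="utf-8"))
        if not str(config.get(key_name, "")).strip():
            problems.append(f"'{key_name}' is empty or missing in config.json")
    return problems


if __name__ == "__main__":
    for problem in preflight("."):
        print("Problem:", problem)
```

An empty list means both checks passed. The script does not validate the key against Scopus.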

Remarks

The labeling directory contains JSON files exported from Label Studio with the labeled abstracts of both the methanation and hydroformylation datasets. It also contains the labels predicted by the models and the models' performance results.
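
To get a quick overview of such an export, the labels can be counted per class. This is a minimal sketch that assumes the files follow the common Label Studio JSON export layout (a list of tasks, each with annotations whose result items carry a value with labels). The function count_labels and the example task are illustrative and not taken from the repository's data.

```python
import json
from collections import Counter


def count_labels(tasks):
    """Count span labels across all annotations in a Label Studio export."""
    counts = Counter()
    for task in tasks:
        for annotation in task.get("annotations", []):
            for item in annotation.get("result", []):
                for label in item.get("value", {}).get("labels", []):
                    counts[label] += 1
    return counts


# Tiny hand-written task in the assumed export shape (not real data).
example = [{
    "data": {"text": "Ni/Al2O3 was used for CO2 methanation."},
    "annotations": [{"result": [
        {"type": "labels",
         "value": {"start": 0, "end": 8, "text": "Ni/Al2O3",
                   "labels": ["Catalyst"]}},
        {"type": "labels",
         "value": {"start": 22, "end": 37, "text": "CO2 methanation",
                   "labels": ["Reaction"]}},
    ]}],
}]

if __name__ == "__main__":
    print(count_labels(example))
    # For a real file, load it first (the path is up to you):
    #   with open(path, encoding="utf-8") as f:
    #       print(count_labels(json.load(f)))
```

If a file uses a different structure, the nested keys in count_labels need to be adapted accordingly.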
