This repository hosts a prototype tool designed to analyze and aggregate FAIR (Findable, Accessible, Interoperable, and Reusable) statistics for a list of Digital Object Identifiers (DOIs). The tool currently utilizes the F-UJI FAIR checker to evaluate the FAIRness of the metadata associated with each DOI. Future versions aim to incorporate additional FAIR checkers to provide a more comprehensive analysis.
The tool processes a list of DOIs, which can be sourced from a website or fetched through a metasearch API such as Crossref or DataCite. It calculates FAIR statistics for each DOI, aggregates them by publication year, and identifies common metadata errors that reduce FAIRness. The results are presented as a diagram of aggregated FAIR statistics per publication year and a summary of the most frequent metadata issues.
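As a rough illustration of the API route, the sketch below extracts DOI and publication-year pairs from a response shaped like the Crossref REST API `works` endpoint (`message.items[].DOI` and `message.items[].issued["date-parts"]`). The function name `extract_dois_with_year` and the sample data are hypothetical; this is not taken from the tool's own code.

```python
from typing import Any, Dict, List, Tuple


def extract_dois_with_year(response: Dict[str, Any]) -> List[Tuple[str, int]]:
    """Return (DOI, publication year) pairs from a Crossref-style works response."""
    pairs = []
    for item in response.get("message", {}).get("items", []):
        doi = item.get("DOI")
        # "issued" carries the publication date as nested date parts, e.g. [[2021, 3, 4]].
        date_parts = item.get("issued", {}).get("date-parts", [[None]])
        year = date_parts[0][0] if date_parts and date_parts[0] else None
        # Skip records without a DOI or without a usable year.
        if doi and year is not None:
            pairs.append((doi.lower(), int(year)))
    return pairs


# Hand-written sample shaped like a Crossref response; not real data.
sample = {
    "message": {
        "items": [
            {"DOI": "10.1234/ABC.1", "issued": {"date-parts": [[2021, 3, 4]]}},
            {"DOI": "10.1234/abc.2", "issued": {"date-parts": [[None]]}},
        ]
    }
}
pairs = extract_dois_with_year(sample)
print(pairs)
```

Records without a publication year are dropped here because they cannot be assigned to a year bucket later; a real pipeline might prefer to log them instead.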
The results also give metadata providers (e.g., Springer, Nature) a reason to host their metadata in a machine-readable format, as this is crucial for optimal FAIRness evaluation.
Warning: The F-UJI FAIR checker must be running in a Docker container before the tool is used. Instructions for setting up the F-UJI checker can be found in the F-UJI repository (https://github.com/FAIR-IMPACT/fuji). Please note that F-UJI and other FAIR checkers are at a very early beta stage.
- DOI List Processing: Accepts a list of DOIs from a file or fetched via APIs like Crossref or DataCite.
- FAIR Evaluation: Uses the F-UJI FAIR checker to evaluate the FAIRness of each DOI's metadata.
- Aggregation: Aggregates FAIR statistics by publication year.
- Error Summary: Identifies and summarizes the most common metadata errors affecting FAIRness.
- Visualization: Generates a diagram of aggregated FAIR statistics per publication year.
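The aggregation step can be pictured with a minimal sketch that averages per-DOI scores within each publication year. The function name `aggregate_by_year`, the record layout, and the scores are assumptions made for illustration; the tool's actual data structures may differ.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_year(records):
    """Average the FAIR score of each publication year.

    `records` is an iterable of (publication_year, fair_score_percent) pairs.
    """
    by_year = defaultdict(list)
    for year, score in records:
        by_year[year].append(score)
    # Sort by year so the result can be plotted directly as a time series.
    return {year: round(mean(scores), 2) for year, scores in sorted(by_year.items())}


# Made-up scores for illustration only.
records = [(2020, 40.0), (2020, 60.0), (2021, 75.0)]
summary = aggregate_by_year(records)
print(summary)
```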
- Python 3.x
- Docker (for running the F-UJI FAIR checker)
- Required Python packages (listed in requirements.txt)
- Clone the repository:
git clone https://github.com/saibotmagd/fair_stats_aggregator.git
cd fair_stats_aggregator
- Install the required Python packages:
pip install -r requirements.txt
- Set up the F-UJI FAIR checker (https://github.com/FAIR-IMPACT/fuji) using Docker:
docker pull fairimpact/fuji
docker run -d -p 1071:1071 fairimpact/fuji
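Once the container is running, the checker is reachable over HTTP on port 1071. The sketch below builds, without sending, an evaluation request for one DOI. The endpoint path `/fuji/api/v1/evaluate`, the payload fields `object_identifier` and `use_datacite`, and the use of HTTP Basic authentication are assumptions based on F-UJI's commonly documented defaults; check them against the configuration of the running instance. The helper name `build_fuji_request` and the credentials are hypothetical.

```python
import base64
import json
import urllib.request

# Assumed default endpoint; adjust to match the running F-UJI instance.
FUJI_EVALUATE_URL = "http://localhost:1071/fuji/api/v1/evaluate"


def build_fuji_request(doi, username, password, url=FUJI_EVALUATE_URL):
    """Build (but do not send) a POST request asking F-UJI to assess one DOI."""
    # Payload field names are assumptions; verify them against the F-UJI API docs.
    payload = {"object_identifier": f"https://doi.org/{doi}", "use_datacite": True}
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


# Placeholder DOI and credentials for illustration only.
request = build_fuji_request("10.1234/example", "user", "secret")
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```

Keeping request construction separate from sending makes it possible to inspect the payload before any network call is made.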
- Prepare a list of DOIs in a text file (one DOI per line) or use an API to fetch DOIs.
- Run the tool:
python fair_stats_agg.py --doi-file path/to/doi_list.txt
The repository includes an example list, "example_DOI_list.txt", containing the publications of the Leibniz Institute for Neurobiology Magdeburg.
- The tool will output the aggregated FAIR statistics and a summary of metadata errors.
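Before running the tool, a DOI list can be cleaned with a few lines of Python. The sketch below reads one DOI per line, skips blank lines, strips common resolver prefixes, and removes duplicates. The function name `parse_doi_lines` is hypothetical, and whether the tool itself performs any of this normalization is not stated here.

```python
def parse_doi_lines(lines):
    """Return unique, normalized DOIs from an iterable of text lines."""
    seen = set()
    dois = []
    for line in lines:
        doi = line.strip()
        if not doi:
            continue  # ignore blank lines
        # Strip a resolver or scheme prefix if one is present.
        for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
            if doi.lower().startswith(prefix):
                doi = doi[len(prefix):]
                break
        doi = doi.lower()  # DOIs are case-insensitive
        if doi not in seen:
            seen.add(doi)
            dois.append(doi)
    return dois


# Placeholder DOIs for illustration only.
sample_lines = [
    "10.1234/ABC.1\n",
    "\n",
    "https://doi.org/10.1234/abc.1\n",
    "doi:10.5678/xyz\n",
]
cleaned = parse_doi_lines(sample_lines)
print(cleaned)
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example `parse_doi_lines(open("example_DOI_list.txt", encoding="utf-8"))`.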
- Diagram of Aggregated FAIR Statistics per Publication Year: A visual representation of FAIR statistics aggregated by publication year.
- Metadata Error Summary: A list of the most common metadata errors affecting FAIRness.
- Justification for Metadata Providers: A summary highlighting the importance of machine-readable metadata for optimal FAIRness evaluation.
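One way to picture the metadata error summary is a frequency count over the checks that failed for each DOI. The following sketch uses made-up issue labels and a hypothetical helper `most_common_issues`; the labels F-UJI actually reports will differ.

```python
from collections import Counter


def most_common_issues(failed_checks_per_doi, top_n=3):
    """Return the `top_n` most frequent issues as (issue, DOI count) pairs."""
    counts = Counter()
    for failed in failed_checks_per_doi.values():
        # Use a set so an issue is counted at most once per DOI.
        counts.update(set(failed))
    return counts.most_common(top_n)


# Made-up DOIs and issue labels for illustration only.
failed_checks = {
    "10.1234/a": ["missing license", "no persistent identifier for data"],
    "10.1234/b": ["missing license"],
    "10.1234/c": ["missing license", "no provenance information"],
}
top = most_common_issues(failed_checks, top_n=1)
print(top)
```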
- Beta Status: The F-UJI FAIR checker and other FAIR checkers are at a very early beta stage. Results may vary and should be interpreted with caution.
- Dependency on Docker: The F-UJI FAIR checker must be running in a Docker container before the tool is started.
This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). Under this license, you are free to:
- Share — Copy and redistribute the material in any medium or format.
- Adapt — Remix, transform, and build upon the material.
The licensor cannot revoke these freedoms as long as you follow the license terms.
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- NonCommercial — You may not use the material for commercial purposes.
- You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation.
- No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.
For more details, please refer to the full license text: CC BY-NC 4.0 License.