HydroLLM-Benchmark

A Specialized Benchmark Dataset for Hydrology-Focused Question-Answering

Welcome to HydroLLM-Benchmark, a repository that provides a benchmark dataset of hydrology-specific question-answer pairs. The dataset is AI-generated and is intended to support research in hydrological modeling, machine learning, and data-driven water resource management. Rather than comparing the performance of existing models, as traditional benchmarks do, this repository introduces a dataset that researchers and practitioners can use to evaluate or develop specialized AI models in hydrology.

Overview

HydroLLM-Benchmark aims to streamline the development of domain-specific AI solutions in hydrology by offering a comprehensive benchmark dataset. We generated True/False, Multiple Choice, Fill in the Blanks, and Open-Ended questions from two sources: foundational textbook content and a large collection of recent hydrology research articles. The dataset serves as a baseline resource for evaluating or training AI models in hydrology; it does not provide direct comparisons between different models.

The repository is organized as follows:

Datasets/: CSV files containing the AI-generated hydrology questions, categorized by both question type and source type (textbook vs. research article).
GenerateQA/: Scripts used to generate the question-answer pairs automatically.
Model Results/: Optional example scripts that illustrate how one might evaluate an AI model using this dataset.
Resources/: Hydrological references, such as the Fundamentals of Hydrology PDF used to generate the textbook-based Q&A.
Utility Scripts: Top-level files (ChapterDivider.py, PostProcessData.py, post_process.py, getArticleFullText.py, pullArticles.py) for parsing, data cleaning, and article retrieval.
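
To make the evaluation use case concrete, here is a minimal sketch of exact-match scoring for True/False or Multiple Choice rows. It is not the code in Model Results/. The column names `question` and `answer`, the `predict` callable, and the two sample rows are all assumptions made for illustration, so check the actual CSV headers before adapting it.

```python
from typing import Callable, Iterable, Mapping


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are not counted as errors."""
    return " ".join(text.strip().lower().split())


def exact_match_accuracy(
    rows: Iterable[Mapping[str, str]],
    predict: Callable[[str], str],
    question_col: str = "question",  # assumed column name, not confirmed by the dataset
    answer_col: str = "answer",      # assumed column name, not confirmed by the dataset
) -> float:
    """Return the fraction of rows where the prediction matches the reference answer."""
    rows = list(rows)
    if not rows:
        return 0.0
    correct = sum(
        normalize(predict(row[question_col])) == normalize(row[answer_col])
        for row in rows
    )
    return correct / len(rows)


# Made-up sample rows (not taken from the dataset) and a placeholder model
# that always answers "True".
sample_rows = [
    {"question": "Evaporation converts liquid water to water vapor.", "answer": "True"},
    {"question": "Infiltration is the movement of water from the ocean to the atmosphere.", "answer": "False"},
]
print(exact_match_accuracy(sample_rows, lambda question: "True"))
```

Exact matching only suits question types with a closed answer set. Open-Ended and Fill in the Blanks answers would likely need a softer comparison, which this sketch does not attempt.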

Getting Started

Clone the Repository

git clone https://github.com/uihilab/HydroLLM-Benchmark.git
cd HydroLLM-Benchmark
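
After cloning, a quick way to see what the dataset contains is to read every CSV under Datasets/ with the Python standard library. The sketch below assumes only that the files are UTF-8 CSVs with a header row. `load_question_files` is a hypothetical helper written for this example, not a function shipped in the repository.

```python
import csv
from pathlib import Path


def load_question_files(dataset_dir):
    """Return {relative CSV path: list of row dicts} for every CSV under dataset_dir."""
    root = Path(dataset_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Dataset directory not found: {root}")
    tables = {}
    for csv_path in sorted(root.rglob("*.csv")):
        # Assumption: files are UTF-8 encoded and the first line holds column names.
        with csv_path.open(newline="", encoding="utf-8") as handle:
            tables[csv_path.relative_to(root).as_posix()] = list(csv.DictReader(handle))
    return tables
```

Calling `load_question_files("Datasets")` from the repository root returns one entry per CSV file. Printing the keys of the first row in each entry shows the real column names, which are worth checking before writing any evaluation code.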

Feedback

We welcome your feedback, suggestions, or any issues you might encounter. Here are a few ways to reach us:

Open an Issue: Submit a GitHub issue describing your question or concern.
Pull Requests: We encourage contributions that enhance the dataset or improve the scripts.
Contact: Feel free to share ideas or request features through email or our online community.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgements

This benchmark dataset was developed by the University of Iowa Hydroinformatics Lab (UIHI Lab). We extend our gratitude to all contributors and community members who have supported this project, helping to foster innovation at the intersection of hydrology and AI.

Kizilkaya, D., Sajja, R., Sermet, Y., & Demir, I. (2025). Towards HydroLLM: A Benchmark Dataset for Hydrology-Specific Knowledge Assessment for Large Language Models. DOI: https://doi.org/10.31223/X5R410
