This repository contains the ALEA Legal Benchmarking project, which aims to provide a comprehensive, open, and replicable benchmarking framework for AI systems in the legal domain. The project is part of the ALEA Institute's research agenda on the safe and responsible deployment of AI systems.
While other domain-specific benchmarks exist, this project is different because:
- it provides complete data and source code for replicating generation of benchmarks
- it can be used to easily generate new samples that are truly de novo
- it does not rely on synthetic data sourced from models with use restrictions (e.g., OpenAI or Llama prior to 3.1)
- it covers both low-level tasks (e.g., classification and span annotation) and high-level tasks (e.g., question answering and interpretation)
- it provides a simple Python library for retrieving, generating, and assessing models on new benchmarks
In general, the benchmark data generation techniques in this project fall into four categories:
- Class I: human-generated, human-annotated data
- Class II: machine-generated, human-annotated data
- Class III: human-generated, machine-annotated data
- Class IV: machine-generated, machine-annotated data
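The four classes above form a 2x2 grid over who generated the data and who annotated it. The sketch below is one way to encode that grid in Python. It is illustrative only: the names `Source`, `BenchmarkClass`, and `classify` are hypothetical and are not part of this project's library.

```python
from enum import Enum


class Source(Enum):
    """Who produced the text or the labels (hypothetical helper type)."""
    HUMAN = "human"
    MACHINE = "machine"


class BenchmarkClass(Enum):
    """The four data generation classes listed above."""
    I = "Class I"
    II = "Class II"
    III = "Class III"
    IV = "Class IV"


# (generated_by, annotated_by) -> class, mirroring the list above.
_TAXONOMY = {
    (Source.HUMAN, Source.HUMAN): BenchmarkClass.I,
    (Source.MACHINE, Source.HUMAN): BenchmarkClass.II,
    (Source.HUMAN, Source.MACHINE): BenchmarkClass.III,
    (Source.MACHINE, Source.MACHINE): BenchmarkClass.IV,
}


def classify(generated_by: Source, annotated_by: Source) -> BenchmarkClass:
    """Return the class for a given generation/annotation pairing."""
    return _TAXONOMY[(generated_by, annotated_by)]


# Machine-generated, machine-annotated data falls in Class IV.
print(classify(Source.MACHINE, Source.MACHINE).value)  # prints "Class IV"
```

Keeping the mapping in a single dictionary makes the taxonomy easy to audit against the list above.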
The nature of each benchmark experiment is clearly documented and indicated in the roadmap below.
- Document Classification
- soli_docs_001: Class IV
- Clause Classification
- soli_clauses_001: Class IV
- Pending
This ALEA project is released under permissive licensing:
- Source code is licensed under the MIT License. See the LICENSE file for details.
- Data distributed with this project is generated by Llama 3.1 405B. To the maximum extent possible, we release the data under the Creative Commons Attribution 4.0 International License, but the Llama 3.1 License and Acceptable Use Policy may apply to some uses. Please consult the Llama 3.1 terms for details.
- Publication pending
If you encounter any issues or have questions about using this ALEA project, please open an issue on GitHub.
To learn more about ALEA and its software and research projects like KL3M, visit the ALEA website.