DeBEIR (Dense Bi-Encoder for Information Retrieval) is a library for experimenting with and using neural models (with a particular emphasis on bi-encoder models) for end-to-end ranking of documents.
- Python >= 3.10
- A GPU supported by PyTorch is recommended
- Beyond that, consider the requirements of your index, such as storage and CPU usage
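As a quick sanity check of the requirements above, the following sketch reports whether the running interpreter meets the Python version requirement and whether PyTorch can see a CUDA GPU. The helper name `check_environment` is hypothetical and not part of the library; it relies only on the standard library and, if PyTorch is installed, on `torch.cuda.is_available()`.

```python
import sys


def check_environment(min_version=(3, 10)):
    """Hypothetical helper (not part of DeBEIR): summarize the local setup."""
    report = {
        "python_ok": sys.version_info >= min_version,
        "gpu_available": False,
    }
    try:
        import torch  # optional; only needed for the GPU check

        report["gpu_available"] = bool(torch.cuda.is_available())
    except ImportError:
        # PyTorch is not installed, so the GPU check is skipped.
        pass
    return report


if __name__ == "__main__":
    print(check_environment())
```

A result with `gpu_available` set to False does not prevent the library from being installed; it only indicates that encoding would fall back to the CPU.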
It is recommended to set up a virtual environment and install from source:
python3 -m venv venv
source venv/bin/activate
pip install git+https://github.com/Ayuei/DeBEIR.git
# Sentence Segmentation Model install
pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.5.0/en_core_sci_md-0.5.0.tar.gz
The library emphasizes reproducibility and experimentation. With this in mind, settings are kept in configuration files, which are then used to build the pipeline:
import asyncio

from debeir.interfaces.pipeline import NIRPipeline


async def main():
    p = NIRPipeline.build_from_config(config_fp="./tests/config.toml",
                                      engine="elasticsearch",
                                      nir_config_fp="./tests/nir_config.toml")
    # The cosine offset ensures a non-negative score.
    return await p.run_pipeline(cosine_offset=1.0)

# run_pipeline is awaited, so it has to run inside an event loop.
results = asyncio.run(main())
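To illustrate why a `cosine_offset` of 1.0 yields non-negative scores: cosine similarity lies in [-1, 1], so adding 1.0 shifts it into [0, 2]. The sketch below is a standalone illustration in plain Python. The function names are hypothetical, and this is not necessarily how the library or the search engine computes scores internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def offset_score(a, b, cosine_offset=1.0):
    """Shift the cosine similarity so that the lowest possible score is 0."""
    return cosine_similarity(a, b) + cosine_offset


query = [1.0, 0.0]
opposite = [-1.0, 0.0]
print(offset_score(query, opposite))  # 0.0 (worst case: vectors point in opposite directions)
print(offset_score(query, query))     # 2.0 (best case: identical direction)
```

The offset changes every score by the same constant, so the relative ranking of documents is unaffected.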
See examples/ for more use cases and where to get started.
API documentation for the library is available as rendered HTML at https://ayuei.github.io/DeBEIR/debeir.html. It is built with the pdoc3 library and rebuilt on every commit via gh-pages.
Statically compiled documentation (which is updated less frequently) can be found in the top-level directory at docs/index.html.
You can also build this documentation with the pdoc library by executing the following commands:
pip install -r requirements-docs.txt
pdoc -o docs/ src/debeir/
If you wish to help with development of the library, first set up a development environment and verify that the test cases pass. This takes approximately 30 minutes on a mid-range system.
Requires: Docker and the packages listed in requirements-dev.txt, installed with pip.
virtualenv venv
source venv/bin/activate
pip install -r requirements-dev.txt
cd tests/
./build_test_env.sh
pytest .
A helper script for removing the development environment is provided in tests/cleanup.sh.
If you have any issue with the current library, please create an issue.
For those wanting to contribute to the library, please see CONTRIBUTING.md and submit a pull request!
If you wish to reach out to the author and maintainer of this library, please email [email protected]