# Inference Engine for LLM Video Benchmarks

This repository contains an inference engine designed to quickly and efficiently run video-based large language model (LLM) benchmarks. The engine leverages parallelism to maximize resource usage and minimize compute time.

## Table of Contents

- [Installation](#installation)
- [Configuration](#configuration)
- [Usage](#usage)
  - [Prepare Batches](#prepare-batches)
  - [Run Inference](#run-inference)
- [Results](#results)
- [Contributing](#contributing)
- [License](#license)

## Installation

To get started, clone the repository and install the required dependencies:

```bash
git clone https://github.com/tensorsense/inference_engine.git
cd inference_engine
pip3 install -r requirements.txt
```

## Configuration

The engine is driven by a configuration file (`config.yaml`) that specifies run parameters such as input and output paths, batching, and worker settings.
An example `config.yaml` is included in the repository; use it as a template and modify it to fit your requirements.
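
As a rough, hypothetical illustration of what such a file might contain (every key below is an assumption, not the shipped schema; consult the example file in the repository for the real one):

```yaml
# Hypothetical config.yaml sketch -- all keys here are assumptions, not the
# shipped schema; use the repository's example file as the actual template.
data_path: data/benchmark_questions.json  # video-question pairs to evaluate
video_dir: data/videos                    # directory containing the videos
output_dir: results                       # where predictions and scores go
batch_size: 8                             # video-question pairs per batch
num_local_workers: 4                      # parallel LLM inference workers
num_openai_workers: 2                     # parallel OpenAI evaluation workers
```
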
### API Keys

Create a `.env` file in the root directory of the repository and add your API keys to it. The `.env` file should look like this:

```
OPENAI_API_VERSION="2023-07-01-preview"
AZURE_OPENAI_ENDPOINT=your_azure_openai_endpoint
AZURE_OPENAI_API_KEY=your_azure_openai_api_key
```
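
These variables are read at runtime by the OpenAI evaluation workers. As a hedged sketch of how they map onto an Azure client (assuming `python-dotenv` and the `openai>=1.0` SDK; the engine's actual client setup may differ):

```python
# Hedged sketch: load the .env file and build an Azure OpenAI client.
# Assumes python-dotenv and openai>=1.0; the engine's own code may differ.
import os

from dotenv import load_dotenv
from openai import AzureOpenAI

load_dotenv()  # reads .env from the repository root

client = AzureOpenAI(
    api_version=os.environ["OPENAI_API_VERSION"],
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
)
```
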
## Usage

### Prepare Batches

The first step is to prepare batches of video-question pairs for processing. The `prepare_batches` function reads the input data and creates batches based on the configuration.
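
The actual function lives in this repository and its signature may differ; the sketch below only illustrates the general idea of chunking video-question pairs into fixed-size batches (the pair format and field names are assumptions):

```python
# Hedged sketch of the batching step -- the repository's prepare_batches may
# read different fields; the pair structure here is an assumption.
import json


def prepare_batches(data_path: str, batch_size: int) -> list[list[dict]]:
    """Split video-question pairs into fixed-size batches."""
    with open(data_path) as f:
        pairs = json.load(f)  # e.g. [{"video": "clip.mp4", "question": "..."}, ...]
    return [pairs[i : i + batch_size] for i in range(0, len(pairs), batch_size)]
```
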
### Run Inference

Run the main script to start the inference process:

```bash
python3 eval.py
```

This will:
1. Set up output paths.
2. Load the configuration.
3. Prepare batches.
4. Start local workers for LLM inference.
5. Start OpenAI workers for evaluation.
6. Monitor progress and save results.

The engine uses multiprocessing to parallelize the processing of batches, significantly reducing the overall compute time.
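
The exact worker layout is defined in `eval.py`; the self-contained sketch below (with stand-in inference and evaluation steps) shows the queue-based producer/consumer pattern this description implies:

```python
# Hedged sketch of the two-stage multiprocessing pipeline: local workers run
# inference, OpenAI workers score the predictions. The payloads and stand-in
# bodies are assumptions; see eval.py for the real implementation.
import multiprocessing as mp


def local_worker(batch_q: mp.Queue, eval_q: mp.Queue) -> None:
    # Consume batches until a None sentinel arrives, then shut down the next stage.
    while (batch := batch_q.get()) is not None:
        eval_q.put([f"prediction for {item}" for item in batch])  # stand-in for VLM inference
    eval_q.put(None)


def openai_worker(eval_q: mp.Queue) -> None:
    while (preds := eval_q.get()) is not None:
        print("scoring", preds)  # stand-in for GPT-based evaluation


if __name__ == "__main__":
    batch_q, eval_q = mp.Queue(), mp.Queue()
    procs = [
        mp.Process(target=local_worker, args=(batch_q, eval_q)),
        mp.Process(target=openai_worker, args=(eval_q,)),
    ]
    for p in procs:
        p.start()
    for batch in (["q1", "q2"], ["q3"]):
        batch_q.put(batch)
    batch_q.put(None)  # sentinel: no more batches
    for p in procs:
        p.join()
```
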
## Results

After the inference and evaluation processes complete, results are saved in the specified output directory. The final results include detailed information about each question-answer pair, the model's prediction, and evaluation scores.

The following metrics are computed and saved:
- Average Score
- Accuracy
- Yes/No counts

These metrics provide insight into the performance of the evaluated models.
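
The result schema is defined by the engine; assuming each saved record carries a numeric `score` and a yes/no `pred` verdict from the evaluator (both field names are hypothetical), the metrics reduce to:

```python
# Hedged sketch of the summary metrics, assuming each result record holds a
# numeric "score" and a yes/no "pred" verdict (field names are assumptions).
def summarize(results: list[dict]) -> dict:
    yes = sum(1 for r in results if r["pred"].lower() == "yes")
    return {
        "average_score": sum(r["score"] for r in results) / len(results),
        "accuracy": yes / len(results),  # fraction the evaluator judged correct
        "yes_count": yes,
        "no_count": len(results) - yes,
    }


print(summarize([{"pred": "yes", "score": 4}, {"pred": "no", "score": 1}]))
```
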
## Contributing

We welcome contributions to improve the inference engine. Please submit a pull request or open an issue to discuss your ideas.

## License

This project is licensed under the Apache 2.0 License. See the [LICENSE](LICENSE) file for details.