Welcome to Marlin! This repository is dedicated to exploring and experimenting with Rust-based solutions for deep learning inference. Our goal is to leverage Rust's performance and safety features to improve the efficiency of serving large deep learning models.
## Overview

Marlin is an experimental project focused on evaluating Rust's capabilities in handling deep learning inference tasks. We compare Rust-based implementations with Python counterparts to assess performance, latency, and scalability.
## Features

- Rust-based Inference: Implementation of deep learning models in Rust, with an emphasis on performance and thread safety.
- Model Serving: Use of Candle from Hugging Face and `actix-web` for serving models (a minimal sketch follows this list).
- Benchmarking: Tools and scripts for performance and encoding-time benchmarks.
- Comparative Analysis: Side-by-side comparison with Python-based implementations to evaluate performance differences.
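To make the serving setup concrete, here is a minimal sketch of an `actix-web` endpoint wrapping an inference call. This is illustrative only, not this repo's actual code: the `/encode` route, the `EncodeRequest`/`EncodeResponse` types, the port, and the 384-dimensional dummy embedding are all assumptions, and the Candle model invocation is left as a placeholder.

```rust
// Hypothetical sketch of a Candle-backed inference endpoint.
// Assumed dependencies: actix-web = "4", serde = { version = "1", features = ["derive"] }
use actix_web::{post, web, App, HttpResponse, HttpServer, Responder};
use serde::{Deserialize, Serialize};

#[derive(Deserialize)]
struct EncodeRequest {
    text: String,
}

#[derive(Serialize)]
struct EncodeResponse {
    embedding: Vec<f32>,
}

#[post("/encode")]
async fn encode(req: web::Json<EncodeRequest>) -> impl Responder {
    // A real handler would tokenize `req.text` and run it through a Candle
    // model here; this sketch returns a fixed-size dummy vector instead.
    let _input = &req.text;
    let embedding = vec![0.0f32; 384]; // 384 dims, typical of MiniLM-style sentence encoders
    HttpResponse::Ok().json(EncodeResponse { embedding })
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| App::new().service(encode))
        .bind(("127.0.0.1", 8080))?
        .run()
        .await
}
```

In a real server, the model would typically be loaded once at startup and shared read-only across actix-web's worker threads (for example via `web::Data`), which is where Rust's thread-safety guarantees pay off.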
## Prerequisites

- Rust (1.67.0 or later)
- `cargo` (Rust package manager and build tool)
## Getting Started

1. Clone the repository:

   ```bash
   git clone https://github.com/AbhishekBose/marlin.git
   cd marlin
   ```
2. Build the project:

   ```bash
   cargo build
   ```
3. Start the Rust-based server:

   ```bash
   cargo run --release
   ```
4. Run the Python-based server (if applicable):

   ```bash
   cd scripts
   pip install fastapi uvicorn sentence-transformers
   uvicorn main:app --reload
   ```
5. Perform load and encoding benchmarks:

   - Use the provided benchmarking scripts in the `scripts/` directory to test performance and compare results (a minimal latency-measurement sketch follows these steps).
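As an illustration of what such an encoding-time benchmark measures, here is a minimal, self-contained Rust sketch. It is not one of the repo's actual scripts: `encode_once` is a hypothetical stand-in for a real inference call (a request to the running server, or a direct model invocation), and the warmup and iteration counts are arbitrary.

```rust
use std::time::Instant;

// Hypothetical stand-in for one inference call; replace with a real request
// against the running server or a direct Candle model invocation.
fn encode_once(text: &str) -> Vec<f32> {
    vec![text.len() as f32; 4]
}

fn main() {
    let warmup = 10;
    let iters = 100;
    let input = "The quick brown fox jumps over the lazy dog";

    // Warm up caches, allocators, and (in a real setup) model state.
    for _ in 0..warmup {
        encode_once(input);
    }

    // Record per-call latency in milliseconds.
    let mut latencies_ms = Vec::with_capacity(iters);
    for _ in 0..iters {
        let start = Instant::now();
        encode_once(input);
        latencies_ms.push(start.elapsed().as_secs_f64() * 1000.0);
    }

    latencies_ms.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let mean: f64 = latencies_ms.iter().sum::<f64>() / iters as f64;
    let p95 = latencies_ms[(iters as f64 * 0.95) as usize];
    println!("mean: {mean:.3} ms, p95: {p95:.3} ms over {iters} iterations");
}
```

Benchmark the release build: Rust debug builds are often an order of magnitude slower and will distort any comparison with the Python server.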
## Contributing

Contributions are welcome! Please open an issue or submit a pull request if you have suggestions, improvements, or bug fixes.

## License

This project is licensed under the MIT License. See the LICENSE file for details.

## Contact

For any questions or discussions, feel free to reach out to Abhishek Bose.
Happy experimenting! 🚀