bRAG AI extends code-centric language models by combining Retrieval-Augmented Generation (RAG) with parameter-efficient fine-tuning. Built for dynamic, fast-evolving software ecosystems, bRAG AI uses Low-Rank Adaptation (LoRA) and Fill-in-the-Middle (FIM) training to adapt models to specific domains and codebases with high precision.
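For readers unfamiliar with FIM: it rearranges each training sample into prefix, suffix, and middle segments marked with sentinel tokens, so the model learns to infill code rather than only complete it left to right. The snippet below is a minimal sketch of that transformation; the sentinel token names follow the convention common to infilling code models and are illustrative, while the project's actual implementation lives in `finetune/training/fim.py`.

```python
import random

# Illustrative sentinel tokens; real tokenizers define their own special tokens.
FIM_PREFIX, FIM_SUFFIX, FIM_MIDDLE = "<fim_prefix>", "<fim_suffix>", "<fim_middle>"

def make_fim_example(code: str, rng: random.Random) -> str:
    """Split a code string into prefix/middle/suffix and re-order it into the
    PSM (prefix-suffix-middle) layout used for infilling training."""
    lo, hi = sorted(rng.sample(range(len(code)), 2))
    prefix, middle, suffix = code[:lo], code[lo:hi], code[hi:]
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}"

sample = "def add(a, b):\n    return a + b\n"
print(make_fim_example(sample, random.Random(0)))
```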
(Shortened - 9 minutes) Technical Video on bRAGAI [edited]
(Full 17 minutes) Technical Video on bRAGAI [uncut]
Explore the technical foundations of bRAG AI in our detailed research paper:
- Title: bRAG AI: Retrieval-Augmented Fine-Tuning for Code LLMs
- Author: Taha H. Ababou
- Affiliation: Boston University
Download the Research Paper (PDF)
- Efficient Fine-Tuning: Use LoRA and QLoRA to fine-tune large models with minimal computational overhead (see the configuration sketch after this list).
- Dynamic Context Retrieval: Seamlessly integrate knowledge from diverse sources like GitHub repositories, academic papers, and multimedia transcripts.
- Context-Aware Reasoning: Improve code-generation accuracy and domain-specific adaptability by enriching prompts with retrieved, real-time context.
- Engineered for Evolution: Purpose-built to address the challenges of ever-changing codebases and frameworks.
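As an illustration of the LoRA/QLoRA setup, the sketch below wraps a causal LM with a low-rank adapter using the Hugging Face `transformers` and `peft` libraries. The model name, rank, and target modules are illustrative defaults, not necessarily the exact values used in `finetune/training/train.py`.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base_model = "codellama/CodeLlama-7b-hf"  # illustrative; use whichever base model you fine-tune

# 4-bit quantization (the "Q" in QLoRA) keeps the frozen base weights small.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, quantization_config=bnb_config)

# The low-rank adapter matrices are the only trainable parameters.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
```

From here the adapted model can be trained with a standard Hugging Face `Trainer` loop (the repository's `run_peft.sh` and `train.py` drive the actual training run), and only the small adapter weights need to be saved and shared.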
Get started with bRAG AI in a few simple steps:
# Clone the repository
git clone https://github.com/bRAGAI/bragai
# Navigate to the project directory
cd bragai
# Install dependencies
pip install -r requirements.txt
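Once the dependencies are installed, the core retrieval-augmented pattern is: embed your knowledge sources (repositories, papers, transcripts), retrieve the chunks most similar to the user's query, and prepend them to the prompt. The sketch below shows that retrieve-then-prompt loop with a deliberately simple bag-of-words scorer; it illustrates the idea only and is not the project's retrieval API.

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real pipeline would use a dense embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

# Knowledge chunks could come from cloned GitHub repos, papers, or transcripts.
chunks = [
    "LoRA adds trainable low-rank matrices to frozen attention weights.",
    "Fill-in-the-Middle training teaches a model to infill code between a prefix and suffix.",
    "HumanEval measures functional correctness of generated code with unit tests.",
]
query = "How does LoRA reduce the number of trainable parameters?"
context = "\n".join(retrieve(query, chunks))
prompt = f"Answer using the context below.\n\nContext:\n{context}\n\nQuestion: {query}\n"
print(prompt)  # this enriched prompt is what gets sent to the fine-tuned model
```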
.
├── README.md                         # Documentation
├── eval
│   ├── code-eval
│   │   ├── codellama-base-eval.py    # Evaluation script for base model
│   │   ├── codellama-bragai-eval.py  # Evaluation script for fine-tuned model
│   │   ├── humaneval/                # HumanEval benchmark integration
│   │   └── requirements.txt          # Dependencies for code evaluation
│   ├── rag.py                        # RAG evaluation script
│   ├── requirements.txt              # Dependencies for evaluation scripts
│   └── zeroshot.py                   # Zero-shot evaluation script
├── finetune
│   ├── data
│   │   ├── README.md                 # Dataset preparation instructions
│   │   ├── clone_gh_repos.py         # Script to clone GitHub repositories
│   │   ├── prepare_dataset.py        # Script to prepare dataset for fine-tuning
│   │   ├── push_to_hub.py            # Script to upload datasets to Hugging Face Hub
│   │   └── requirements.txt          # Dependencies for dataset preparation
│   ├── inference/                    # Inference-related scripts
│   └── training
│       ├── fim.py                    # Implements FIM (Fill-in-the-Middle) techniques
│       ├── requirements.txt          # Dependencies for training
│       ├── run_peft.sh               # Shell script for PEFT training
│       └── train.py                  # Fine-tuning script
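For reference, the `eval/code-eval` scripts report HumanEval-style functional correctness, which is usually summarized with the unbiased pass@k estimator of Chen et al. (2021). The helper below computes that estimate for a single problem given n generated samples of which c pass the unit tests; this is the general formula, and the exact reporting in the codellama-*-eval.py scripts may differ.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples drawn
    from n generations (c of which are correct) passes the unit tests."""
    if n - c < k:
        return 1.0  # every k-subset must contain at least one correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 200 samples per problem, 37 of them correct.
print(pass_at_k(200, 37, 1))   # ≈ 0.185
print(pass_at_k(200, 37, 10))  # ≈ 0.88
```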
If bRAG AI contributes to your research, please use the following citation:
@article{ababou2024bragAI,
  title       = {bRAG AI: Retrieval-Augmented Fine-Tuning for Code LLMs},
  author      = {Ababou, Taha H.},
  year        = {2024},
  institution = {Boston University}
}