Final project for OSU CSE6431.
A report on the project findings can be found here.
To set up and run the VRAM prediction tool:
- Install CUDA from the NVIDIA website (neither the tool nor the experiments will work on non-NVIDIA GPUs, or on a machine without CUDA set up)
- Ensure Python 3.10 or later is installed
- Create a virtual environment and install the required packages there (torch >=2.2.2, marshmallow, marshmallow_dataclass)
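For example (one plausible install command; the repo may pin different versions):
pip install "torch>=2.2.2" marshmallow marshmallow_dataclass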
- Run the predictor with the following command to see a full list of options:
python vram_use_predictor.py -h
- Run the tool to predict outcomes for a wide range of configurations of one model and report the best ones. Note that the current working directory must be the one containing both the script and the model_details folder:
python vram_use_predictor.py google/gemma_2b.json
- Run the tool to predict outcomes for a constrained set of configurations for one model (a rough sketch of the estimation arithmetic appears after these examples):
python vram_use_predictor.py google/gemma_7b.json --lora-mlp True
or
python vram_use_predictor.py google/gemma_7b.json --lora-embed False --batch-size 2 --num-configs 50
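The predictor's internals are not reproduced in this README, but the sketch below illustrates the kind of arithmetic such a tool performs: summing the frozen base weights, the trainable LoRA parameters with their gradient and optimizer state, and a rough activation term. Every name and constant in it is an assumption for illustration, not the tool's actual model.

```python
# Illustrative sketch only: a back-of-the-envelope VRAM estimate for LoRA
# fine-tuning. All names and constants are assumptions, not the tool's model.

def estimate_vram_gib(
    base_params: float,        # total base-model parameters (e.g. ~2.5e9 for a "2B" model)
    lora_params: float,        # trainable LoRA parameters
    batch_size: int,
    seq_len: int,
    hidden_size: int,
    num_layers: int,
    bytes_per_weight: int = 2, # fp16/bf16 base weights
) -> float:
    GIB = 1024 ** 3
    # Frozen base weights stay resident for the whole run.
    weights = base_params * bytes_per_weight
    # Only LoRA weights carry gradients and Adam optimizer state
    # (~4 bytes fp32 grad + 8 bytes for two fp32 moments per parameter).
    trainable = lora_params * (bytes_per_weight + 12)
    # Crude activation term: one bf16 hidden-state tensor per layer.
    activations = batch_size * seq_len * hidden_size * num_layers * 2
    return (weights + trainable + activations) / GIB

# Hypothetical 2B-parameter model with ~10M LoRA parameters:
print(f"{estimate_vram_gib(2.5e9, 1e7, 2, 1024, 2048, 18):.1f} GiB")
```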
To run the fine-tuning experiments themselves:
1. Install WSL (Windows Subsystem for Linux) if you are using Windows
2. Inside the Linux environment, create a Python virtual environment and install the required packages (e.g. torch >=2.2.2, bitsandbytes, trl, peft, transformers)
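For example (one plausible install command; the repo may pin different versions):
pip install "torch>=2.2.2" bitsandbytes trl peft transformers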
3. Execute the training runs inside the Linux environment, either:
   a. from the command line, using one of the "test_gemma_?b_experiment_from_cli.py" scripts, customized as necessary to test a particular scenario (a minimal sketch of such a run appears after this list), or
   b. by setting up PyCharm with a remote interpreter inside the Linux environment and then running the Jupyter notebooks from PyCharm
      - if using WSL, you may need to uninstall and reinstall PyCharm for it to detect WSL as a possible source of remote interpreters
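The experiment scripts themselves are not reproduced here; the sketch below shows the general shape of a 4-bit LoRA fine-tuning run using the packages listed above (plus the datasets library, which trl pulls in). The model id, dataset, and every hyperparameter are placeholders rather than the repo's actual settings, and SFTTrainer keyword names have shifted across trl releases, so this assumes a trl version contemporary with torch 2.2.

```python
# Illustrative sketch only: a minimal 4-bit LoRA fine-tune. Model id, dataset,
# and hyperparameters are placeholders, not this repo's experiment settings.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from trl import SFTTrainer

model_id = "google/gemma-2b"  # gated model; requires Hugging Face access approval

# Load the frozen base model in 4-bit to keep VRAM use low.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Attention-only LoRA here; MLP/embedding modules could be added to mirror
# the predictor's --lora-mlp / --lora-embed options.
lora_config = LoraConfig(
    r=8,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

dataset = load_dataset("Abirate/english_quotes", split="train")  # placeholder

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=lora_config,
    dataset_text_field="quote",   # moved into SFTConfig in newer trl releases
    max_seq_length=512,
    tokenizer=tokenizer,          # renamed processing_class in newer trl
    args=TrainingArguments(
        output_dir="outputs",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        max_steps=10,
        optim="paged_adamw_8bit",  # paged optimizer from bitsandbytes
    ),
)
trainer.train()
```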