# CodeGemma

[CodeGemma](https://ai.google.dev/gemma/docs/codegemma) is a family of decoder-only, text-to-text large language models for programming, built from the same research and technology used to create the [Gemini models](https://blog.google/technology/ai/google-gemini-ai/). CodeGemma models have open weights and offer pre-trained and instruction-tuned variants. These models are well-suited for a variety of code generation tasks. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone.

For more details, refer to the [CodeGemma model card](https://ai.google.dev/gemma/docs/codegemma/model_card) released by Google.
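
If you just want to see the model generate code before customizing it, a few lines with the Hugging Face `transformers` library are enough. This is a quick, illustrative sketch and is separate from the NeMo Framework workflow covered below; the `google/codegemma-2b` checkpoint id and the prompt are assumptions for the example.

```python
# Quick smoke test with Hugging Face transformers (separate from the NeMo workflow below).
# The checkpoint id and prompt are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/codegemma-2b"  # assumed checkpoint id on the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Ask the pre-trained model to complete a function body from its signature and docstring.
prompt = 'def fibonacci(n):\n    """Return the n-th Fibonacci number."""\n'
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```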
## Customizing CodeGemma with NeMo Framework

CodeGemma models are compatible with [NeMo Framework](https://docs.nvidia.com/nemo-framework/user-guide/latest/index.html). In this repository we have a notebook that covers the steps for customizing CodeGemma.
### Parameter-Efficient Fine-Tuning with LoRA

[LoRA tuning](https://arxiv.org/abs/2106.09685) is a parameter-efficient method for fine-tuning models, where we freeze the base model parameters and update an auxiliary "adapter" with far fewer weights. At inference time, the adapter weights are combined with the base model weights to produce a new model, customized for a particular use case or dataset. Because this adapter is so much smaller than the base model, it can be trained with far fewer resources than it would take to fine-tune the entire model. In this example, we'll show you how to LoRA-tune small models like the CodeGemma models on a single GPU.
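
The notebook drives all of this through NeMo Framework, but the mechanics are easy to see in plain PyTorch. The sketch below is a conceptual illustration only (the class name, rank, and alpha values are made up for the example): the base weights are frozen, only two small low-rank matrices are trained, and `merge()` folds the adapter back into the base weights for inference.

```python
# Minimal, illustrative LoRA adapter in plain PyTorch -- a conceptual sketch,
# not the NeMo Framework API used in the notebook.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wrap a frozen linear layer with a trainable low-rank adapter: y = Wx + (alpha/r) * B(Ax)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze the base weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)   # down-projection A
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)  # up-projection B
        nn.init.zeros_(self.lora_b.weight)  # start as a no-op: adapter contributes 0 at first
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(x))

    @torch.no_grad()
    def merge(self) -> nn.Linear:
        """Fold the adapter into the base weights for inference: W' = W + scaling * B @ A."""
        merged = nn.Linear(self.base.in_features, self.base.out_features,
                           bias=self.base.bias is not None)
        merged.weight.copy_(self.base.weight + self.scaling * self.lora_b.weight @ self.lora_a.weight)
        if self.base.bias is not None:
            merged.bias.copy_(self.base.bias)
        return merged


# Only the tiny adapter matrices receive gradients; the base layer stays frozen.
layer = LoRALinear(nn.Linear(4096, 4096), rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable params: {trainable:,} of {total:,}")
```

Initializing `lora_b` to zero means the adapted model starts out identical to the base model, which keeps the early steps of training stable.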
[Get Started Here](./lora.ipynb)
|
16 | 16 |
|
17 | 17 | ### Supervised Fine-Tuning for Instruction Following (SFT)
|
18 | 18 |
|
19 |
| -Supervised Fine-Tuning (SFT) is the process of fine-tuning all of a model’s parameters on supervised data of inputs and outputs. It teaches the model how to follow user specified instructions and is typically done after model pre-training. This example will describe the steps involved in fine-tuning Gemma for instruction following. Gemma was released with a checkpoint already fine-tuned for instruction-following, but here we'll learn how we can tune our own model starting with the pre-trained checkpoint to acheive a similar outcome. |
| 19 | +Supervised Fine-Tuning (SFT) is the process of fine-tuning all of a model’s parameters on supervised data of inputs and outputs. It teaches the model how to follow user specified instructions and is typically done after model pre-training. This example will describe the steps involved in fine-tuning CodeGemma for instruction following. CodeGemma was released with a checkpoint already fine-tuned for instruction-following, but here we'll learn how we can tune our own model starting with the pre-trained checkpoint to acheive a similar outcome. |
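
The notebook uses NeMo Framework for the actual run; the sketch below only illustrates what a single SFT step looks like in plain PyTorch with Hugging Face `transformers`. The checkpoint id, example prompt/response pair, and hyperparameters are assumptions for illustration. The key points are that the loss is computed only on the response tokens (the prompt positions are masked with `-100`) and that every model parameter is updated.

```python
# Conceptual SFT step -- a sketch only, not the NeMo Framework recipe from the notebook.
# A real run shards the model and data across multiple GPUs, which NeMo handles for you.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/codegemma-7b"  # assumed Hugging Face checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.train()  # every parameter is trainable in full SFT
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# One supervised example: an instruction (prompt) and the desired completion (response).
prompt = "Write a Python function that reverses a string.\n"
response = "def reverse(s):\n    return s[::-1]\n"

prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
full_ids = tokenizer(prompt + response, return_tensors="pt").input_ids

# Standard next-token loss, but only on the response: prompt tokens are masked with -100.
labels = full_ids.clone()
labels[:, : prompt_ids.shape[1]] = -100

loss = model(input_ids=full_ids, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```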
Full fine-tuning is more resource-intensive than low-rank adaptation, so for SFT we'll need multiple GPUs, as opposed to the single GPU used for LoRA.