Commit bdbafce

model: Typo corrections (#106)
Co-authored-by: Anjali Shah <[email protected]>
1 parent 1d2d197 commit bdbafce


models/Codegemma/README.md

Lines changed: 6 additions & 6 deletions
@@ -1,22 +1,22 @@
# Codegemma

-[Codegemma](https://ai.google.dev/codegemma/docs) is a family of decoder-only, text-to-text large language models for programming, built from the same research and technology used to create the [Gemini models](https://blog.google/technology/ai/google-gemini-ai/). Codegemma models have open weights and offer pre-trained variants and instruction-tuned variants. These models are well-suited for a variety of code generation tasks. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone.
-For more details, refer the the [Codegemma model card](https://ai.google.dev/codegemma/docs/model_card) released by Google.
+[CodeGemma](https://ai.google.dev/gemma/docs/codegemma) is a family of decoder-only, text-to-text large language models for programming, built from the same research and technology used to create the [Gemini models](https://blog.google/technology/ai/google-gemini-ai/). CodeGemma models have open weights and offer pre-trained and instruction-tuned variants. These models are well-suited for a variety of code generation tasks. Their relatively small size makes it possible to deploy them in environments with limited resources, such as a laptop, desktop, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone.
+For more details, refer to the [CodeGemma model card](https://ai.google.dev/gemma/docs/codegemma/model_card) released by Google.

-## Customizing Gemma with NeMo Framework
+## Customizing CodeGemma with NeMo Framework

-Gemma models are compatible with [NeMo Framework](https://docs.nvidia.com/nemo-framework/user-guide/latest/index.html). In this repository we have two notebooks that covert different ways of customizing Gemma.
+CodeGemma models are compatible with [NeMo Framework](https://docs.nvidia.com/nemo-framework/user-guide/latest/index.html). In this repository, we have a notebook that covers the steps for customizing CodeGemma.

### Parameter-Efficient Fine-Tuning with LoRA

-[LoRA tuning](https://arxiv.org/abs/2106.09685) is a parameter efficient method for fine-tuning models, where we freeze the base model parameters and update an auxilliary "adapter" with many fewer weights. At inference time, the adapter weights are combined with the base model weights to produce a new model, customized for a particular use case or dataset. Because this adapter is so much smaller than the base model, it can be trained with far fewer resources than it would take to fine-tune the entire model. In this example, we'll show you how to LoRA-tune small models like the Gemma models on a single GPU.
+[LoRA tuning](https://arxiv.org/abs/2106.09685) is a parameter-efficient method for fine-tuning models, in which we freeze the base model parameters and update an auxiliary "adapter" with far fewer weights. At inference time, the adapter weights are combined with the base model weights to produce a new model, customized for a particular use case or dataset. Because this adapter is so much smaller than the base model, it can be trained with far fewer resources than it would take to fine-tune the entire model. In this example, we'll show you how to LoRA-tune small models like the CodeGemma models on a single GPU.

[Get Started Here](./lora.ipynb)

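To make the mechanism above concrete, here is a minimal PyTorch sketch of a LoRA-wrapped linear layer: the base weights are frozen, only the low-rank matrices train, and at inference the product can be folded back into the base weights. The dimensions, rank, and initialization below are arbitrary placeholders, not CodeGemma's actual configuration; the notebook uses NeMo's built-in LoRA support rather than hand-written layers like this.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a small trainable low-rank adapter."""

    def __init__(self, in_features, out_features, rank=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad = False                  # freeze the base model weights
        self.lora_a = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x):
        # Base output plus the low-rank correction (B @ A) applied to x.
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scaling

    def merge(self):
        # At inference time, fold the adapter into the base weights.
        with torch.no_grad():
            self.base.weight += (self.lora_b @ self.lora_a) * self.scaling

layer = LoRALinear(4096, 4096, rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable parameters: {trainable:,} of {total:,}")      # only the adapter trains
```

Because only the adapter parameters receive gradients and optimizer state, a single GPU is usually enough even when the frozen base model has billions of parameters.
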
### Supervised Fine-Tuning for Instruction Following (SFT)

-Supervised Fine-Tuning (SFT) is the process of fine-tuning all of a model’s parameters on supervised data of inputs and outputs. It teaches the model how to follow user specified instructions and is typically done after model pre-training. This example will describe the steps involved in fine-tuning Gemma for instruction following. Gemma was released with a checkpoint already fine-tuned for instruction-following, but here we'll learn how we can tune our own model starting with the pre-trained checkpoint to acheive a similar outcome.
+Supervised Fine-Tuning (SFT) is the process of fine-tuning all of a model’s parameters on supervised data of inputs and outputs. It teaches the model how to follow user-specified instructions and is typically done after model pre-training. This example describes the steps involved in fine-tuning CodeGemma for instruction following. CodeGemma was released with a checkpoint already fine-tuned for instruction following, but here we'll learn how to tune our own model, starting from the pre-trained checkpoint, to achieve a similar outcome.

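Conceptually, an SFT step is ordinary next-token cross-entropy on prompt/response pairs, with gradients flowing to every parameter; it is also common (though optional) to mask the loss on the prompt tokens so that only the response is scored. The sketch below is a self-contained illustration that uses a byte-level "tokenizer" and a toy model in place of CodeGemma, and the `input`/`output` field names are illustrative rather than the exact format the notebook expects.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# One supervised example: an instruction-style prompt and the desired response.
record = {"input": "Write a Python function that reverses a string.",
          "output": "def reverse(s):\n    return s[::-1]"}

# Toy stand-ins so the sketch runs end to end; a real run would use CodeGemma via NeMo.
encode = lambda text: list(text.encode("utf-8"))                   # byte-level "tokenizer"
model = nn.Sequential(nn.Embedding(256, 64), nn.Linear(64, 256))   # tiny next-token model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

prompt_ids, response_ids = encode(record["input"]), encode(record["output"])
input_ids = torch.tensor([prompt_ids + response_ids])

logits = model(input_ids)                         # (1, seq_len, vocab_size)
labels = input_ids[:, 1:].clone()                 # each position predicts the next token
labels[:, : len(prompt_ids) - 1] = -100           # ignore prompt positions in the loss
loss = F.cross_entropy(logits[:, :-1].reshape(-1, 256),
                       labels.reshape(-1), ignore_index=-100)

loss.backward()                                   # gradients reach *all* model parameters
optimizer.step()
optimizer.zero_grad()
print(f"SFT loss: {loss.item():.3f}")
```
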
Full fine-tuning is more resource-intensive than Low-Rank Adaptation, so for SFT we'll need multiple GPUs, as opposed to the single GPU used for LoRA.

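A rough back-of-envelope estimate shows why: with an Adam-style optimizer, every trainable parameter carries gradients and optimizer state in addition to the weight itself. The figures below assume bf16 weights and gradients, fp32 optimizer moments, a 7B-parameter base model, and an adapter of a few tens of millions of weights; they ignore activations entirely, so treat them as coarse illustrations rather than measured numbers for CodeGemma.

```python
def training_memory_gib(trainable_params, frozen_params=0):
    """Very rough GPU memory estimate: bf16 weights/grads + fp32 Adam moments, no activations."""
    total_bytes = (
        (trainable_params + frozen_params) * 2   # bf16 weights for the whole model
        + trainable_params * 2                   # bf16 gradients, trainable params only
        + trainable_params * 8                   # fp32 Adam first and second moments
    )
    return total_bytes / 1024**3

full_sft = training_memory_gib(trainable_params=7e9)                  # all 7B parameters train
lora = training_memory_gib(trainable_params=40e6, frozen_params=7e9)  # only the adapter trains
print(f"full SFT: ~{full_sft:.0f} GiB, LoRA: ~{lora:.0f} GiB")
# Full SFT needs roughly an entire 80 GB GPU before activations are even counted,
# which is why it is typically sharded across several GPUs; the LoRA run fits comfortably on one.
```
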