Improve IA3 long description (#3845)
arnavgarg1 authored Dec 20, 2023
1 parent e1edcbc commit 0228709
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion ludwig/schema/metadata/configs/llm.yaml
@@ -150,7 +150,7 @@ adapter:
type:
long_description: |
[Infused Adapter by Inhibiting and Amplifying Inner Activations](https://arxiv.org/pdf/2205.05638.pdf), or IA3,
- is a method that adds three learned vectors l_k, l_v, and l_ff, to rescale the keys and values of the self-attention and encoder-decoder attention layers, and the intermediate activation of the position-wise feed-forward network respectively.
+ is a method that adds three learned vectors `l_k`, `l_v`, and `l_ff`, to rescale the keys and values of the self-attention and encoder-decoder attention layers, and the intermediate activation of the position-wise feed-forward network respectively. These learned vectors are the only trainable parameters during fine-tuning, and thus the original weights remain frozen. Dealing with learned vectors (as opposed to learned low-rank updates to a weight matrix like LoRA) keeps the number of trainable parameters much smaller.
target_modules:
ui_display_name: Target Modules
expected_impact: 3
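
The rescaling described in the new long description can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names, the toy activations, and the layer sizes (a 768-wide model with a 3072-wide feed-forward layer, LoRA at rank 8 on two square projections) are hypothetical and are not taken from Ludwig or from any adapter library.

```python
# Illustrative sketch only: all names and dimensions here are hypothetical,
# not part of Ludwig's configuration schema or any adapter library's API.

def ia3_rescale(activations, scale):
    """Multiply each activation vector element-wise by a learned vector."""
    return [[a * s for a, s in zip(row, scale)] for row in activations]


def ia3_param_count(d_k, d_v, d_ff):
    """One learned vector each for keys, values, and the feed-forward activation."""
    return d_k + d_v + d_ff


def lora_param_count(d_in, d_out, rank):
    """A rank-r update to a d_out x d_in weight matrix needs two thin matrices."""
    return rank * (d_in + d_out)


# Toy key activations: two positions, three features each.
keys = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

# A vector of ones is an identity rescaling, so the frozen model's
# activations pass through unchanged.
identity = [1.0, 1.0, 1.0]
unchanged = ia3_rescale(keys, identity)

# A learned vector inhibits (< 1) or amplifies (> 1) individual features.
l_k = [0.5, 1.0, 2.0]
scaled = ia3_rescale(keys, l_k)

# Per-layer trainable parameters under the assumed sizes.
ia3_params = ia3_param_count(d_k=768, d_v=768, d_ff=3072)
lora_params = 2 * lora_param_count(d_in=768, d_out=768, rank=8)
```

With these assumed sizes the IA3 vectors account for 4,608 trainable values per layer against 24,576 for the LoRA update, which is the gap the description refers to. The exact ratio depends on the model's dimensions, the LoRA rank, and which modules are targeted.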