Arguments to put in `network_args` for kohya sd scripts (see the usage examples after this list):
- Set with `algo=ALGO_NAME`
  - Check List of Implemented Algorithms for algorithms to use
- Set with `preset=PRESET/CONFIG_FILE`
  - Pre-implemented: `full` (default), `attn-mlp`, `attn-only`, etc.
  - Valid for all but (IA)^3
  - Use `preset=xxx.toml` to choose a config file (for LyCORIS module settings)
  - More info in Preset
- Dimension of the linear layers is set with the script argument `network_dim`, and dimension of the convolutional layers with `conv_dim=INT`
  - Valid for all but (IA)^3 and native fine-tuning
  - For LoKr, setting the dimension to a sufficiently large value (> 10240/2) prevents the second block from being further decomposed
- Alpha of the linear layers is set with the script argument `network_alpha`, and alpha of the convolutional layers with `conv_alpha=FLOAT`
  - Valid for all but (IA)^3 and native fine-tuning; also ignored by full-dimension LoKr
  - The merge ratio is alpha/dimension; check Appendix B.1 of our paper for the relation between alpha and learning rate / initialization (a worked example follows this list)
- Set with `dropout=FLOAT`, `rank_dropout=FLOAT`, `module_dropout=FLOAT`
  - Set the dropout rates; which types of dropout are valid varies from method to method
- Set with `factor=INT`
  - Valid for LoKr
  - Use `-1` to get the smallest decomposition
- Enabled with `decompose_both=True`
  - Valid for LoKr
  - Perform LoRA decomposition of both matrices resulting from the LoKr decomposition (by default only the larger matrix is decomposed)
- Set with `block_size=INT`
  - Valid for DyLoRA
  - Set the "unit" of DyLoRA (i.e. how many rows/columns to update each time)
- Enabled with `use_tucker=True`
  - Valid for all but (IA)^3 and native fine-tuning
  - It was incorrectly named `use_cp=` in older versions
- Enabled with `use_scalar=True`
  - Valid for LoRA, LoHa, and LoKr
  - Train an additional scalar in front of the weight difference
  - Use a different weight initialization strategy
- Enabled with `dora_wd=True`
  - Valid for LoRA, LoHa, and LoKr
  - Enable the DoRA method for these algorithms
  - Will force `bypass_mode=False`
- Enabled with `bypass_mode=True`
  - Valid for LoRA, LoHa, and LoKr
  - Use $Y = WX + \Delta W X$ instead of $Y = (W + \Delta W)X$
  - Designed for bnb 8bit/4bit linear layers (QLyCORIS)
- Enabled with `train_norm=True`
  - Valid for all but (IA)^3
- Enabled with `rescaled=True`
  - Valid for Diag-OFT
- Set with `constraint=FLOAT`
  - Valid for Diag-OFT
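
For reference, here is a minimal sketch of how these options are passed on the command line, assuming kohya's `train_network.py` entry point. The model path is a placeholder, the remaining training arguments (dataset, output, optimizer) are omitted, and the numeric values are purely illustrative, not recommendations:

```bash
# Sketch: LoCon with convolutional layers, dropout, and the attn-mlp preset.
# Only the LyCORIS-related arguments are shown; paths and values are placeholders.
python train_network.py \
  --pretrained_model_name_or_path "/path/to/model.safetensors" \
  --network_module lycoris.kohya \
  --network_dim 16 --network_alpha 8 \
  --network_args "algo=locon" "preset=attn-mlp" \
                 "conv_dim=8" "conv_alpha=4" "dropout=0.1"
  # ...plus the usual dataset / output / optimizer arguments
```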
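
Boolean and algorithm-specific options go through `--network_args` in the same way. A sketch under the same assumptions as above, here for LoKr:

```bash
# Sketch: LoKr with the smallest Kronecker factor, both resulting matrices
# decomposed, Tucker decomposition, DoRA, and norm-layer training enabled.
# Values are illustrative only.
python train_network.py \
  --network_module lycoris.kohya \
  --network_dim 4 --network_alpha 1 \
  --network_args "algo=lokr" "factor=-1" "decompose_both=True" \
                 "use_tucker=True" "dora_wd=True" "train_norm=True"
  # ...plus model / dataset / output arguments as before
```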
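
To make the merge ratio from the alpha entry concrete (the numbers are only an example):

$$\text{merge ratio} = \frac{\text{alpha}}{\text{dimension}}, \qquad \text{e.g. } \texttt{network\_dim}=16,\ \texttt{network\_alpha}=8 \ \Rightarrow\ \tfrac{8}{16} = 0.5$$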