[Question] How can an individual layer be merged from one model into another? #476

T145 · 2024-12-26T21:54:57Z

I made the following configuration:

base_model: T145/ZEUS-8B-V10
dtype: bfloat16
merge_method: linear
parameters:
  int8_mask: 1.0
  normalize: 1.0
  weight:
  - filter: lm_head
    value: 0.0
  - value: 1.0
slices:
- sources:
  - layer_range: [0, 32]
    model: SicariusSicariiStuff/LLAMA-3_8B_Unaligned_BETA
    parameters:
      weight:
      - filter: lm_head
        value: 1.0
      - value: 0.0
  - layer_range: [0, 32]
    model: T145/ZEUS-8B-V10
tokenizer_source: base

However, after running benchmarks, the merged model performed significantly worse than the base model. As I understand it, this config should take the lm_head from the other model (SicariusSicariiStuff/LLAMA-3_8B_Unaligned_BETA) and everything else from the base model (T145/ZEUS-8B-V10). Did I do something wrong here, or are drastic changes to be expected?
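
To make the expectation concrete, here is a minimal Python sketch of the arithmetic I assume the config describes. It is only an illustration, not the merge tool's actual implementation: it assumes a linear merge is a normalized weighted sum per tensor, that a `filter` matches by substring of the tensor name, and that the first matching rule wins. The helper names (`pick_weight`, `linear_merge`), the labels `base` and `donor`, and the toy tensor values are all hypothetical.

```python
# Toy model of the intended merge: each "tensor" is a short list of floats.
# Assumption: filters match by substring and the first matching rule wins.

def pick_weight(rules, tensor_name):
    """Return the weight of the first rule that applies to this tensor."""
    for rule in rules:
        pattern = rule.get("filter")
        if pattern is None or pattern in tensor_name:
            return rule["value"]
    raise KeyError(f"no weight rule matches {tensor_name}")

def linear_merge(models, rules, normalize=True):
    """Weighted sum of every tensor across models, optionally normalized."""
    first = next(iter(models.values()))
    merged = {}
    for name, ref in first.items():
        weights = {m: pick_weight(rules[m], name) for m in models}
        total = sum(weights.values())
        scale = total if (normalize and total != 0) else 1.0
        merged[name] = [
            sum(weights[m] * models[m][name][i] for m in models) / scale
            for i in range(len(ref))
        ]
    return merged

# Placeholder tensors standing in for the two checkpoints.
models = {
    "base": {
        "model.layers.0.self_attn.q_proj.weight": [1.0, 2.0],
        "lm_head.weight": [5.0, 6.0],
    },
    "donor": {
        "model.layers.0.self_attn.q_proj.weight": [3.0, 4.0],
        "lm_head.weight": [7.0, 8.0],
    },
}

# Same weight rules as in the config above, written out per model.
rules = {
    "base": [{"filter": "lm_head", "value": 0.0}, {"value": 1.0}],
    "donor": [{"filter": "lm_head", "value": 1.0}, {"value": 0.0}],
}

merged = linear_merge(models, rules)
print(merged)
```

Under these assumptions, only `lm_head.weight` comes from the donor and every other tensor is copied unchanged from the base. If the real merge behaves differently, for example in how filters match tensor names, that difference could explain the benchmark drop.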
