diff --git a/src/pages/blog/llama3-just-got-ears.mdx b/src/pages/blog/llama3-just-got-ears.mdx
index f26f0d9..d329970 100644
--- a/src/pages/blog/llama3-just-got-ears.mdx
+++ b/src/pages/blog/llama3-just-got-ears.mdx
@@ -84,7 +84,7 @@ We found it useful to pre-train llama3.1 on continuous speech, through rough abl
 | **Warmup Steps** | 20 |
 | **Weight Decay** | 0.01 |
 | **Gradient Checkpointing** | Full |
-| **Max length** | 4096 |
+| **Max length** | 512 |
 | **Precision** | bf16 |
 
 The learning rate schedule is as follows, starting with a relatively high LR for sufficient warmup.