
UniMem_Exp

Impact of Fine-Tuning Data

We trained the TinyLLaMA-1.1B and LLaMA2-7B models on the GitHub dataset. The figures below show how perplexity changes as the volume of training data grows from 4 million to 104 million (roughly 0.1 billion) tokens.

Figure 1: Perplexity Curves of TinyLLaMA on the GitHub Dataset


Figure 2: Perplexity Curves of LLaMA2-7B on the GitHub Dataset

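For reference, the sketch below shows one common way to measure token-level perplexity for a causal language model on a code corpus with the Hugging Face `transformers` library. It is a minimal illustration, not the exact evaluation script used for these figures; the checkpoint name, the sample text, and the `perplexity` helper are assumptions.

```python
# Minimal perplexity sketch (assumed setup; not the repository's evaluation code).
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # placeholder checkpoint
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(device).eval()


def perplexity(texts, max_length=2048):
    """Average token-level perplexity over a list of documents."""
    nll_sum, token_count = 0.0, 0
    with torch.no_grad():
        for text in texts:
            enc = tokenizer(
                text, return_tensors="pt", truncation=True, max_length=max_length
            ).to(device)
            # Passing labels=input_ids makes the model return the mean
            # cross-entropy over the shifted target tokens.
            out = model(**enc, labels=enc["input_ids"])
            n_targets = enc["input_ids"].numel() - 1  # first token has no prediction
            nll_sum += out.loss.item() * n_targets
            token_count += n_targets
    return math.exp(nll_sum / token_count)


print(perplexity(["def add(a, b):\n    return a + b\n"]))
```

Aggregating the negative log-likelihood over all target tokens before exponentiating gives a corpus-level perplexity rather than an average of per-document perplexities, which is the usual convention for curves like those above.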
