add arithmo2's results

SakanaAI · Mar 13, 2024 · a462165
1 parent 24ab3f7

Showing 2 changed files with 22 additions and 11 deletions.
diff --git a/README.md b/README.md
@@ -1,26 +1,27 @@
 # Evolutionary Optimization of Model Merging Recipes
 
-This is an official repository of [Evolutionary Optimization of Model Merging Recipes](https://arxiv.org/) to reproduce the results.
+This is an official repository of [Evolutionary Optimization of Model Merging Recipes](https://arxiv.org/TODO) to reproduce the results.
 
 ## Model Zoo
 
 ### LLM
 
-| Model | MGSM-JA (acc &uarr;) | [lm-eval-harness](https://github.com/Stability-AI/lm-evaluation-harness/tree/jp-stable) (Average &uarr;) |
-| :-- | --: | --: |
-| [shisa-gamma-7b-v1](https://huggingface.co/augmxnt/shisa-gamma-7b-v1) | 9.6 | 66.1 |
-| [WizardMath-7B-V1.1](https://huggingface.co/WizardLM/WizardMath-7B-V1.1) | 18.4 | 60.1 |
-| [Abel-7B-002](https://huggingface.co/GAIR/Abel-7B-002) | 30.0 | 56.5 |
-| [(Ours) EvoLLM-v1-JP-7B-A](https://huggingface.co/SakanaAI/EvoLLM-v1-JP-7B-A) | 52.4 | 69.0 |
-| [(Ours) EvoLLM-v1-JP-7B](https://huggingface.co/SakanaAI/EvoLLM-v1-JP-7B) | 52.0 | **70.5** |
-| [(Ours) EvoLLM-v1-JP-10B](https://huggingface.co/SakanaAI/EvoLLM-v1-JP-10B) | **55.6** | 68.2 |
+| Id. | Model | MGSM-JA (acc &uarr;) | [lm-eval-harness](https://github.com/Stability-AI/lm-evaluation-harness/tree/jp-stable) (Average &uarr;) |
+| :--: | :-- | --: | --: |
+| 1 | [Shisa Gamma 7B v1](https://huggingface.co/augmxnt/shisa-gamma-7b-v1) | 9.6 | 66.1 |
+| 2 | [WizardMath 7B V1.1](https://huggingface.co/WizardLM/WizardMath-7B-V1.1) | 18.4 | 60.1 |
+| 3 | [Abel 7B 002](https://huggingface.co/GAIR/Abel-7B-002) | 30.0 | 56.5 |
+| 4 | [Arithmo2 Mistral 7B](https://huggingface.co/upaya07/Arithmo2-Mistral-7B) | 24.0 | 56.4 |
+| 5 | [(Ours) EvoLLM-v1-JP-7B-A](https://huggingface.co/SakanaAI/EvoLLM-v1-JP-7B-A) | **52.4** | **69.0** |
+| 6 | [(Ours) EvoLLM-v1-JP-7B](https://huggingface.co/SakanaAI/EvoLLM-v1-JP-7B) | **52.0** | **70.5** |
+| 7 | [(Ours) EvoLLM-v1-JP-10B](https://huggingface.co/SakanaAI/EvoLLM-v1-JP-10B) | **55.6** | **68.2** |
 
 ### VLM
 
-| Model | Ja-VG-VQA-500 (Ja-R-L &uarr;) | JaVLM-Bench-In-the-Wild (Ja-R-L &uarr;) |
+| Model | JA-VG-VQA-500 (ROUGE-L &uarr;) | JA-VLM-Bench-In-the-Wild (ROUGE-L &uarr;) |
 | :-- | --: | --: |
 | [LLaVA-1.6-Mistral-7B](https://llava-vl.github.io/blog/2024-01-30-llava-next/) | 14.32 | 41.10 |
-| [JSVLM](https://huggingface.co/stabilityai/japanese-stable-vlm) | - | 40.50 |
+| [Japanese Stable VLM](https://huggingface.co/stabilityai/japanese-stable-vlm) | - | 40.50 |
 | [Heron BLIP Japanese StableLM Base 7B llava-620k](https://huggingface.co/turing-motors/heron-chat-blip-ja-stablelm-base-7b-v1-llava-620k)\* | 8.73 | 27.37 |
 | [(Ours) EvoVLM-v1-JP-7B](https://huggingface.co/SakanaAI/EvoVLM-v1-JP-7B) | **19.70** | **51.25** |
 

diff --git a/configs/llm/arithmo2-mistral-7b.yaml b/configs/llm/arithmo2-mistral-7b.yaml
@@ -0,0 +1,10 @@
+model:
+  target: evofactory.CausalLMWithvLLM
+  params:
+    model_path: upaya07/Arithmo2-Mistral-7B
+    model_kwargs:
+      dtype: bfloat16
+    template: ja-alpaca-cot
+
+eval:
+  target: evofactory.eval.JaMGSM
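
Note on the new config: each top-level section pairs a dotted `target` path with optional `params`. The sketch below shows one common way such a layout can be turned into objects. It is an illustration only and not necessarily how `evofactory` loads its configs. The helpers `split_target` and `instantiate` are hypothetical names, and the demo resolves a standard-library class so that it runs without `evofactory` or vLLM installed.

```python
import importlib
from typing import Any, Dict, Tuple


def split_target(path: str) -> Tuple[str, str]:
    """Split 'pkg.module.Name' into ('pkg.module', 'Name')."""
    module_name, _, attr = path.rpartition(".")
    if not module_name:
        raise ValueError(f"target must be a dotted path, got {path!r}")
    return module_name, attr


def instantiate(section: Dict[str, Any]) -> Any:
    """Import the class named by section['target'] and call it with section['params']."""
    module_name, attr = split_target(section["target"])
    cls = getattr(importlib.import_module(module_name), attr)
    return cls(**section.get("params", {}))


# Same shape as configs/llm/arithmo2-mistral-7b.yaml, written as a dict
# so the sketch needs no YAML parser.
config = {
    "model": {
        "target": "evofactory.CausalLMWithvLLM",
        "params": {
            "model_path": "upaya07/Arithmo2-Mistral-7B",
            "model_kwargs": {"dtype": "bfloat16"},
            "template": "ja-alpaca-cot",
        },
    },
    "eval": {"target": "evofactory.eval.JaMGSM"},
}

# Inspect how each target would be resolved, without importing evofactory.
for name, section in config.items():
    print(name, split_target(section["target"]))

# Demo with a standard-library stand-in (assumption: any importable class works the same way).
demo = instantiate(
    {"target": "fractions.Fraction", "params": {"numerator": 1, "denominator": 3}}
)
print(demo)
```

With the repository installed, `instantiate(config["model"])` would pass `model_path`, `model_kwargs`, and `template` as keyword arguments to the target class, assuming its constructor accepts those names.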