Merge pull request #326 from Anhforth/merge_aquila

Merge aquila
FlagAI-Open · Jun 9, 2023 · deeae84 · deeae84
2 parents 46127c9 + dd72e8a
commit deeae84
Show file tree

Hide file tree

Showing 24 changed files with 691 additions and 231 deletions.
diff --git a/README.md b/README.md
@@ -19,7 +19,7 @@ FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensibl
 
     FlagAI provides an API that allows you to quickly download pre-trained models and fine-tune them on a wide range of datasets collected from [SuperGLUE](https://super.gluebenchmark.com/) and [CLUE](https://github.com/CLUEbenchmark/CLUE) benchmarks for both Chinese and English text.
 
-    FlagAI now supports over 30 mainstream models, including multilingual text and image representation model [**AltCLIP**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/AltCLIP), text-to-image generation model [**AltDiffusion**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/AltDiffusion) [![Huggingface space](https://img.shields.io/badge/🤗-Huggingface%20Space-cyan.svg)](https://huggingface.co/spaces/BAAI/bilingual_stable_diffusion), [**WuDao GLM**](/docs/GLM.md) (with a maximum of 10 billion parameters), [**EVA-CLIP**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/EVA_CLIP), **OPT**, **BERT**, **RoBERTa**, **GPT2**, **T5**, **ALM**, and models from **Huggingface Transformers**, etc.
+    FlagAI now supports over 30 mainstream models, including Language Model [**Aquila**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/Aquila), multilingual text and image representation model [**AltCLIP**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/AltCLIP), text-to-image generation model [**AltDiffusion**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/AltDiffusion) [![Huggingface space](https://img.shields.io/badge/🤗-Huggingface%20Space-cyan.svg)](https://huggingface.co/spaces/BAAI/bilingual_stable_diffusion), [**WuDao GLM**](/docs/GLM.md) (with a maximum of 10 billion parameters), [**EVA-CLIP**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/EVA_CLIP), **OPT**, **BERT**, **RoBERTa**, **GPT2**, **T5**, **ALM**, and models from **Huggingface Transformers**, etc.
 
 
 2. **Parallel train with fewer than 10 lines of code**
@@ -56,6 +56,7 @@ FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensibl
 
 |   Model          |  Task    | Train | Finetune | Inference/Generate | Examples       |                                                         
 | :---------------- | :------- | :-- |:-- | :-- | :--------------------------------------------- |
+| Aquila      | Natural Language Processing  | ✅  | ✅  | ✅  | [README.md](examples/Aquila/README.md) 
 | ALM          | Arabic Text Generation  |  ✅  | ❌  | ✅  | [README.md](/examples/ALM/README.md)  |                         
 | AltCLIP       | Image-Text Matching  | ✅  | ✅  | ✅  | [README.md](/examples/AltCLIP/README.md)   |  
 | AltCLIP-m18      | Image-Text Matching  | ✅  | ✅  | ✅  | [README.md](examples/AltCLIP-m18/README.md)   |                             

diff --git a/README_zh.md b/README_zh.md
@@ -26,7 +26,7 @@
 
     提供 API 方便你快速下载模型，并在给定（中/英文）文本上使用这些预训练模型，在从[SuperGLUE](https://super.gluebenchmark.com/)和[CLUE](https://github.com/CLUEbenchmark/CLUE) benchmarks收集的广泛使用的数据集上对它们进行微调。
 
-      FlagAI 现已支持 30+ 主流模型，包括多模态模型 [**AltCLIP**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/AltCLIP) 、文生图模型 [**AltDiffusion**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/AltDiffusion) [![Huggingface space](https://img.shields.io/badge/🤗-Huggingface%20Space-cyan.svg)](https://huggingface.co/spaces/BAAI/bilingual_stable_diffusion)、最高百亿参数的 **[悟道GLM](/doc_zh/GLM.md)**，[**EVA-CLIP**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/EVA_CLIP)、**[Galactica](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/galactica)**、**OPT**、**BERT**、**RoBERTa**、**GPT2**、**T5**、**ALM**、**Huggingface Transformers** 等。
+      FlagAI 现已支持 30+ 主流模型，包括语言模型[**Aquila**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/Aquila), 多模态模型 [**AltCLIP**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/AltCLIP) 、文生图模型 [**AltDiffusion**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/AltDiffusion) [![Huggingface space](https://img.shields.io/badge/🤗-Huggingface%20Space-cyan.svg)](https://huggingface.co/spaces/BAAI/bilingual_stable_diffusion)、最高百亿参数的 **[悟道GLM](/doc_zh/GLM.md)**，[**EVA-CLIP**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/EVA_CLIP)、**[Galactica](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/galactica)**、**OPT**、**BERT**、**RoBERTa**、**GPT2**、**T5**、**ALM**、**Huggingface Transformers** 等。
 
 2.  **仅用十行代码即可进行并行训练**
 
@@ -59,6 +59,7 @@
 
 |    模型名称            | 任务      | 训练 | 微调 | 推理 | 样例           |                                                         
 | :---------------- | :------- | :-- |:-- | :-- | :--------------------------------------------- |
+| Aquila      | 自然语言处理  | ✅  | ✅  | ✅  | [README.md](examples/Aquila/README.md) 
 | ALM          | 阿拉伯语文本生成   |  ✅  | ❌  | ✅  | [README.md](/examples/ALM/README.md)  |                         
 | AltCLIP       | 文图匹配 | ✅  | ✅  | ✅  | [README.md](/examples/AltCLIP/README.md)   |  
 | AltCLIP-m18      | 文图匹配  | ✅  | ✅  | ✅  | [README.md](examples/AltCLIP-m18/README.md)   |                             

diff --git a/examples/Aquila/Aquila-sft/Aquila-sft.yaml → examples/Aquila/Aquila-chat/Aquila-chat.yaml b/examples/Aquila/Aquila-sft/Aquila-sft.yaml → examples/Aquila/Aquila-chat/Aquila-chat.yaml
diff --git a/...es/Aquila/Aquila-sft/README_AquilaChat.md → examples/Aquila/Aquila-chat/README.md b/...es/Aquila/Aquila-sft/README_AquilaChat.md → examples/Aquila/Aquila-chat/README.md
diff --git a/examples/Aquila/Aquila-sft/aquila_sft.py → examples/Aquila/Aquila-chat/aquila_chat.py b/examples/Aquila/Aquila-sft/aquila_sft.py → examples/Aquila/Aquila-chat/aquila_chat.py
@@ -13,7 +13,7 @@
 from flagai.env_trainer_v1 import EnvTrainer
 import jsonlines
 import numpy as np
-from examples.Aquila import cyg_conversation as conversation_lib
+import cyg_conversation as conversation_lib
 from flagai.model.tools.lora.prepare_lora import lora_transfer
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 

diff --git a/examples/Aquila/Aquila-sft/bmtrain_mgpu.sh → examples/Aquila/Aquila-chat/bmtrain_mgpu.sh b/examples/Aquila/Aquila-sft/bmtrain_mgpu.sh → examples/Aquila/Aquila-chat/bmtrain_mgpu.sh
diff --git a/examples/Aquila/cyg_conversation.py → ...es/Aquila/Aquila-chat/cyg_conversation.py b/examples/Aquila/cyg_conversation.py → ...es/Aquila/Aquila-chat/cyg_conversation.py
diff --git a/.../Aquila/Aquila-sft/data/sft_samples.jsonl → ...Aquila/Aquila-chat/data/sft_samples.jsonl b/.../Aquila/Aquila-sft/data/sft_samples.jsonl → ...Aquila/Aquila-chat/data/sft_samples.jsonl
diff --git a/.../Aquila/Aquila-sft/dist_trigger_docker.sh → ...Aquila/Aquila-chat/dist_trigger_docker.sh b/.../Aquila/Aquila-sft/dist_trigger_docker.sh → ...Aquila/Aquila-chat/dist_trigger_docker.sh
diff --git a/examples/Aquila/Aquila-sft/generate_sft.py → examples/Aquila/Aquila-chat/generate_chat.py b/examples/Aquila/Aquila-sft/generate_sft.py → examples/Aquila/Aquila-chat/generate_chat.py
@@ -92,7 +92,7 @@ def convo_tokenize(convo_obj, tokenizer):
     print('-'*80)
     print(f"text is {text}")
 
-    from examples.Aquila.cyg_conversation import default_conversation
+    from cyg_conversation import default_conversation
 
     conv = default_conversation.copy()
     conv.append_message(conv.roles[0], text)

diff --git a/examples/Aquila/Aquila-chat/generate_chat_bminf.py b/examples/Aquila/Aquila-chat/generate_chat_bminf.py
@@ -0,0 +1,109 @@
+import os
+import torch
+from flagai.auto_model.auto_loader import AutoLoader
+from flagai.model.predictor.predictor import Predictor
+from flagai.model.predictor.aquila import aquila_generate
+from flagai.data.tokenizer import Tokenizer
+import bminf
+
+state_dict = "/data2/yzd/checkpoints/converted_models_ldwang"
+model_name = 'aquilachat-7b'
+
+loader = AutoLoader(
+    "lm",
+    model_dir=state_dict,
+    model_name=model_name,
+    use_cache=True)
+model = loader.get_model()
+tokenizer = loader.get_tokenizer()
+cache_dir = os.path.join(state_dict, model_name)
+
+model.eval()
+model.half()
+
+with torch.cuda.device(0):
+    model = bminf.wrapper(model, quantization=False, memory_limit=2 << 30)
+
+predictor = Predictor(model, tokenizer)
+
+texts = [
+        "北京为什么是中国的首都？",
+        "1+1=",
+        "为什么湘菜那么甜？",
+        "东三省和海南岛的区别？",
+        ]
+## 
+def pack_obj(text):
+    obj = dict()
+    obj['id'] = 'demo'
+
+    obj['conversations'] = []
+    human = dict()
+    human['from'] = 'human'
+    human['value'] = text
+    obj['conversations'].append(human)
+    # dummy bot
+    bot = dict()
+    bot['from'] = 'gpt'
+    bot['value'] = ''
+    obj['conversations'].append(bot)
+
+    obj['instruction'] = ''
+
+    return obj
+
+def delete_last_bot_end_singal(convo_obj):
+    conversations = convo_obj['conversations']
+    assert len(conversations) > 0 and len(conversations) % 2 == 0
+    assert conversations[0]['from'] == 'human'
+
+    last_bot = conversations[len(conversations)-1]
+    assert last_bot['from'] == 'gpt'
+
+    ## from _add_speaker_and_signal
+    END_SIGNAL = "\n"
+    len_end_singal = len(END_SIGNAL)
+    len_last_bot_value = len(last_bot['value'])
+    last_bot['value'] = last_bot['value'][:len_last_bot_value-len_end_singal]
+    return
+
+def convo_tokenize(convo_obj, tokenizer):
+    chat_desc = convo_obj['chat_desc']
+    instruction = convo_obj['instruction']
+    conversations = convo_obj['conversations']
+
+    # chat_desc
+    example = tokenizer.encode_plus(f"{chat_desc}", None, max_length=None)['input_ids']
+    EOS_TOKEN = example[-1]
+    example = example[:-1] # remove eos
+    # instruction
+    instruction = tokenizer.encode_plus(f"{instruction}", None, max_length=None)['input_ids']
+    instruction = instruction[1:-1] # remove bos & eos
+    example += instruction
+
+    for conversation in conversations:
+        role = conversation['from']
+        content = conversation['value']
+        print(f"role {role}, raw content {content}")
+        content = tokenizer.encode_plus(f"{content}", None, max_length=None)['input_ids']
+        content = content[1:-1] # remove bos & eos
+        print(f"role {role}, content {content}")
+        example += content
+    return example
+
+for text in texts:
+    print('-'*80)
+    print(f"text is {text}")
+
+    from cyg_conversation import default_conversation
+
+    conv = default_conversation.copy()
+    conv.append_message(conv.roles[0], text)
+    conv.append_message(conv.roles[1], None)
+
+    tokens = tokenizer.encode_plus(f"{conv.get_prompt()}", None, max_length=None)['input_ids']
+    tokens = tokens[1:-1]
+
+    with torch.no_grad():
+        out = aquila_generate(tokenizer, model, [text], max_gen_len:=200, top_p=0.95, prompts_tokens=[tokens])
+        print(f"pred is {out}")
diff --git a/examples/Aquila/Aquila-sft/hostfile → examples/Aquila/Aquila-chat/hostfile b/examples/Aquila/Aquila-sft/hostfile → examples/Aquila/Aquila-chat/hostfile
diff --git a/...s/Aquila/Aquila-code/README_AquilaCode.md → examples/Aquila/Aquila-code/README.md b/...s/Aquila/Aquila-code/README_AquilaCode.md → examples/Aquila/Aquila-code/README.md
@@ -148,7 +148,7 @@ bash dist_trigger_docker.sh hostfile Aquila-sft.yaml [aquilacode-7b-nv/aquilacod
 
 ## 证书/License
 
-AquilaCode-7B-NV开源模型使用 [智源Aquila系列模型许可协议](https://huggingface.co/BAAI/AquilaCode-7B-NV/resolve/main/BAAI%20Aquila%20Model%20License%20Agreement.pdf), 原始代码基于[Apache Licence 2.0](https://www.apache.org/licenses/LICENSE-2.0)。
+AquilaCode-7B-NV和AquilaCode-7B-TS开源模型使用 [智源Aquila系列模型许可协议](https://huggingface.co/BAAI/AquilaCode-7B-NV/resolve/main/BAAI%20Aquila%20Model%20License%20Agreement.pdf), 原始代码基于[Apache Licence 2.0](https://www.apache.org/licenses/LICENSE-2.0)。
 
 
-AquilaCode-7B-NV open-source model is licensed under [ BAAI Aquila Model Licence Agreement](https://huggingface.co/BAAI/AquilaCode-7B-NV/resolve/main/BAAI%20Aquila%20Model%20License%20Agreement.pdf). The source code is under [Apache Licence 2.0](https://www.apache.org/licenses/LICENSE-2.0).
+AquilaCode-7B-NV and AquilaCode-7B-TSopen-source model is licensed under [ BAAI Aquila Model Licence Agreement](https://huggingface.co/BAAI/AquilaCode-7B-NV/resolve/main/BAAI%20Aquila%20Model%20License%20Agreement.pdf). The source code is under [Apache Licence 2.0](https://www.apache.org/licenses/LICENSE-2.0).
diff --git a/examples/Aquila/Aquila-code/aquila_code.py b/examples/Aquila/Aquila-code/aquila_code.py