Commit 2ffcf6a

add roformer-v2
1 parent 88bc606 commit 2ffcf6a

9 files changed: +254 -77 lines changed

README.md

Lines changed: 72 additions & 34 deletions
@@ -1,25 +1,26 @@
-# PyTorch RoFormer
-Original TensorFlow weights (https://github.com/ZhuiyiTechnology/roformer)
-- [chinese_roformer_L-12_H-768_A-12.zip](https://pan.baidu.com/s/1fiss862YsGCwf2HvU_Jm-g) (download code: xy9x)
-- [chinese_roformer_L-6_H-384_A-6.zip](https://pan.baidu.com/s/1iIXgZHHCgrYGXVRRSSCVPg) (download code: gy97)
-- [chinese_roformer-char_L-12_H-768_A-12.zip](https://pan.baidu.com/s/1Q1pq8F4Fsl6bTipUAkqeDQ) (download code: bt94)
-- [chinese_roformer-char_L-6_H-384_A-6.zip](https://pan.baidu.com/s/1cc281-M0Rsjlwws5phqzbQ) (download code: a44c)
-- [chinese_roformer-sim-char_L-12_H-768_A-12.zip](https://pan.baidu.com/s/1f1FB288nv1a6jYjsNCordg) (download code: 2cgz)
-- [chinese_roformer-sim-char_L-6_H-384_A-6.zip](https://pan.baidu.com/s/1r0eJ7shGwQ0RzV9BTFFW4g) (download code: h68q)
-
-Weights already converted to PyTorch
-- [chinese_roformer_small.zip](https://pan.baidu.com/s/1Cx7lhtojTyRF61IKHWXEHw) (download code: 8znw)
-- [chinese_roformer_base.zip](https://pan.baidu.com/s/10W5BYDQSeLyajTWjexZeoQ) (download code: bimr)
-- [chinese_roformer_char_base.zip](https://pan.baidu.com/s/18bgJ1t_1ke0BXq_Xg02qSQ) (download code: oqb5)
-
-## Installation (the code has been merged into the huggingface repository)
+# PyTorch RoFormer & RoFormer-V2
+RoFormer and RoFormer-V2 models
+
+## Updates
+- 2022/03/21: Added the `roformer-v2` weights. Note: you must use the code in this repository; the code in the transformers repository cannot be used!
+
+## Installation (the code has been merged into the huggingface repository); the V2 models require the code in this repository
 transformers v4.7 has been released, so it can be installed and used directly
 ```bash
 pip install -U transformers
 ```
+
 ## Model weight mapping table
 
-### Chinese models
+### Chinese models: roformer-v2
+| huggingface.co | bert4keras |
+| ---------------------------------- | ------------------------------------------------ |
+| [roformer_v2_chinese_char_small](https://huggingface.co/junnyu/roformer_v2_chinese_char_small) | [chinese_roformer-v2-char_L-6_H-384_A-6.zip](https://pan.baidu.com/s/1huUrC9P60Afggo8AfiUcmA) (download code:ttn4) |
+| [roformer_v2_chinese_char_base](https://huggingface.co/junnyu/roformer_v2_chinese_char_base) | [chinese_roformer-v2-char_L-12_H-768_A-12.zip](https://pan.baidu.com/s/1qcnN4LVKVe0-mnHlkN3-6Q) (download code:pfoh) |
+| [roformer_v2_chinese_char_large](https://huggingface.co/junnyu/roformer_v2_chinese_char_large) | [chinese_roformer-v2-char_L-24_H-1024_A-16.zip](https://pan.baidu.com/s/1QiJWSZrGxn8vek-8myvL6w) (download code:npfv) |
+
+
+### Chinese models: roformer-v1
 | huggingface.co | bert4keras |
 | ---------------------------------- | ------------------------------------------------ |
 | [roformer_chinese_base](https://huggingface.co/junnyu/roformer_chinese_base) | [chinese_roformer_L-12_H-768_A-12.zip](https://pan.baidu.com/s/1fiss862YsGCwf2HvU_Jm-g) (download code:xy9x) |
@@ -38,34 +39,69 @@ pip install -U transformers
 |[roformer_small_generator](https://huggingface.co/junnyu/roformer_small_generator)|
 |[roformer_small_discriminator](https://huggingface.co/junnyu/roformer_small_discriminator)|
 
-
-## Usage
+## roformer-v2 MLM test
 ```python
 import torch
-from transformers import RoFormerModel, RoFormerTokenizer, TFRoFormerModel
-tokenizer = RoFormerTokenizer.from_pretrained("junnyu/roformer_chinese_base")
-pt_model = RoFormerModel.from_pretrained("junnyu/roformer_chinese_base")
-tf_model = TFRoFormerModel.from_pretrained("junnyu/roformer_chinese_base",
-                                           from_pt=True)
-text = "这里基本保留了唐宋遗留下来的坊巷格局和大量明清古建筑,其中各级文保单位29处,被誉为“里坊制度的活化石”“明清建筑博物馆”!"
+import tensorflow as tf
+from transformers import BertTokenizer
+from roformer import RoFormerForMaskedLM, TFRoFormerForMaskedLM
+
+text = "今天[MASK]很好,我[MASK]去公园玩。"
+tokenizer = BertTokenizer.from_pretrained("junnyu/roformer_v2_chinese_char_base")
+pt_model = RoFormerForMaskedLM.from_pretrained("junnyu/roformer_v2_chinese_char_base")
+tf_model = TFRoFormerForMaskedLM.from_pretrained(
+    "junnyu/roformer_v2_chinese_char_base", from_pt=True
+)
 pt_inputs = tokenizer(text, return_tensors="pt")
 tf_inputs = tokenizer(text, return_tensors="tf")
+# pytorch
 with torch.no_grad():
-    pt_outputs = pt_model(**pt_inputs).last_hidden_state
-    print(pt_outputs.shape)
-tf_outputs = tf_model(**tf_inputs, training=False).last_hidden_state
-print(tf_outputs.shape)
+    pt_outputs = pt_model(**pt_inputs).logits[0]
+pt_outputs_sentence = "pytorch: "
+for i, id in enumerate(tokenizer.encode(text)):
+    if id == tokenizer.mask_token_id:
+        tokens = tokenizer.convert_ids_to_tokens(pt_outputs[i].topk(k=5)[1])
+        pt_outputs_sentence += "[" + "||".join(tokens) + "]"
+    else:
+        pt_outputs_sentence += "".join(
+            tokenizer.convert_ids_to_tokens([id], skip_special_tokens=True)
+        )
+print(pt_outputs_sentence)
+# tf
+tf_outputs = tf_model(**tf_inputs, training=False).logits[0]
+tf_outputs_sentence = "tf: "
+for i, id in enumerate(tokenizer.encode(text)):
+    if id == tokenizer.mask_token_id:
+        tokens = tokenizer.convert_ids_to_tokens(tf.math.top_k(tf_outputs[i], k=5)[1])
+        tf_outputs_sentence += "[" + "||".join(tokens) + "]"
+    else:
+        tf_outputs_sentence += "".join(
+            tokenizer.convert_ids_to_tokens([id], skip_special_tokens=True)
+        )
+print(tf_outputs_sentence)
+# small
+# pytorch: 今天[的||,||是||很||也]很好,我[要||会||是||想||在]去公园玩。
+# tf: 今天[的||,||是||很||也]很好,我[要||会||是||想||在]去公园玩。
+# base
+# pytorch: 今天[我||天||晴||园||玩]很好,我[想||要||会||就||带]去公园玩。
+# tf: 今天[我||天||晴||园||玩]很好,我[想||要||会||就||带]去公园玩。
+# large
+# pytorch: 今天[天||气||我||空||阳]很好,我[又||想||会||就||爱]去公园玩。
+# tf: 今天[天||气||我||空||阳]很好,我[又||想||会||就||爱]去公园玩。
 ```
-## MLM test
+
+## roformer-v1 MLM test
 ```python
 import torch
 import tensorflow as tf
 from transformers import RoFormerForMaskedLM, RoFormerTokenizer, TFRoFormerForMaskedLM
+
 text = "今天[MASK]很好,我[MASK]去公园玩。"
 tokenizer = RoFormerTokenizer.from_pretrained("junnyu/roformer_chinese_base")
 pt_model = RoFormerForMaskedLM.from_pretrained("junnyu/roformer_chinese_base")
 tf_model = TFRoFormerForMaskedLM.from_pretrained(
-    "junnyu/roformer_chinese_base", from_pt=True)
+    "junnyu/roformer_chinese_base", from_pt=True
+)
 pt_inputs = tokenizer(text, return_tensors="pt")
 tf_inputs = tokenizer(text, return_tensors="tf")
 # pytorch
@@ -78,22 +114,24 @@ for i, id in enumerate(tokenizer.encode(text)):
         pt_outputs_sentence += "[" + "||".join(tokens) + "]"
     else:
         pt_outputs_sentence += "".join(
-            tokenizer.convert_ids_to_tokens([id], skip_special_tokens=True))
+            tokenizer.convert_ids_to_tokens([id], skip_special_tokens=True)
+        )
 print(pt_outputs_sentence)
 # tf
 tf_outputs = tf_model(**tf_inputs, training=False).logits[0]
 tf_outputs_sentence = "tf: "
 for i, id in enumerate(tokenizer.encode(text)):
     if id == tokenizer.mask_token_id:
-        tokens = tokenizer.convert_ids_to_tokens(
-            tf.math.top_k(tf_outputs[i], k=5)[1])
+        tokens = tokenizer.convert_ids_to_tokens(tf.math.top_k(tf_outputs[i], k=5)[1])
         tf_outputs_sentence += "[" + "||".join(tokens) + "]"
     else:
         tf_outputs_sentence += "".join(
-            tokenizer.convert_ids_to_tokens([id], skip_special_tokens=True))
+            tokenizer.convert_ids_to_tokens([id], skip_special_tokens=True)
+        )
 print(tf_outputs_sentence)
 # pytorch: 今天[天气||天||心情||阳光||空气]很好,我[想||要||打算||准备||喜欢]去公园玩。
 # tf: 今天[天气||天||心情||阳光||空气]很好,我[想||要||打算||准备||喜欢]去公园玩。
+
 ```
 
 ## Manual weight conversion

examples/test_mlm.py

Lines changed: 0 additions & 23 deletions
This file was deleted.

examples/test_mlm_v1.py

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+import torch
+import tensorflow as tf
+from transformers import RoFormerForMaskedLM, RoFormerTokenizer, TFRoFormerForMaskedLM
+
+text = "今天[MASK]很好,我[MASK]去公园玩。"
+tokenizer = RoFormerTokenizer.from_pretrained("junnyu/roformer_chinese_base")
+pt_model = RoFormerForMaskedLM.from_pretrained("junnyu/roformer_chinese_base")
+tf_model = TFRoFormerForMaskedLM.from_pretrained(
+    "junnyu/roformer_chinese_base", from_pt=True
+)
+pt_inputs = tokenizer(text, return_tensors="pt")
+tf_inputs = tokenizer(text, return_tensors="tf")
+# pytorch
+with torch.no_grad():
+    pt_outputs = pt_model(**pt_inputs).logits[0]
+pt_outputs_sentence = "pytorch: "
+for i, id in enumerate(tokenizer.encode(text)):
+    if id == tokenizer.mask_token_id:
+        tokens = tokenizer.convert_ids_to_tokens(pt_outputs[i].topk(k=5)[1])
+        pt_outputs_sentence += "[" + "||".join(tokens) + "]"
+    else:
+        pt_outputs_sentence += "".join(
+            tokenizer.convert_ids_to_tokens([id], skip_special_tokens=True)
+        )
+print(pt_outputs_sentence)
+# tf
+tf_outputs = tf_model(**tf_inputs, training=False).logits[0]
+tf_outputs_sentence = "tf: "
+for i, id in enumerate(tokenizer.encode(text)):
+    if id == tokenizer.mask_token_id:
+        tokens = tokenizer.convert_ids_to_tokens(tf.math.top_k(tf_outputs[i], k=5)[1])
+        tf_outputs_sentence += "[" + "||".join(tokens) + "]"
+    else:
+        tf_outputs_sentence += "".join(
+            tokenizer.convert_ids_to_tokens([id], skip_special_tokens=True)
+        )
+print(tf_outputs_sentence)
+# pytorch: 今天[天气||天||心情||阳光||空气]很好,我[想||要||打算||准备||喜欢]去公园玩。
+# tf: 今天[天气||天||心情||阳光||空气]很好,我[想||要||打算||准备||喜欢]去公园玩。

examples/test_mlm_v2.py

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
+import torch
+import tensorflow as tf
+from transformers import BertTokenizer
+from roformer import RoFormerForMaskedLM, TFRoFormerForMaskedLM
+
+text = "今天[MASK]很好,我[MASK]去公园玩。"
+tokenizer = BertTokenizer.from_pretrained("junnyu/roformer_v2_chinese_char_base")
+pt_model = RoFormerForMaskedLM.from_pretrained("junnyu/roformer_v2_chinese_char_base")
+tf_model = TFRoFormerForMaskedLM.from_pretrained(
+    "junnyu/roformer_v2_chinese_char_base", from_pt=True
+)
+pt_inputs = tokenizer(text, return_tensors="pt")
+tf_inputs = tokenizer(text, return_tensors="tf")
+# pytorch
+with torch.no_grad():
+    pt_outputs = pt_model(**pt_inputs).logits[0]
+pt_outputs_sentence = "pytorch: "
+for i, id in enumerate(tokenizer.encode(text)):
+    if id == tokenizer.mask_token_id:
+        tokens = tokenizer.convert_ids_to_tokens(pt_outputs[i].topk(k=5)[1])
+        pt_outputs_sentence += "[" + "||".join(tokens) + "]"
+    else:
+        pt_outputs_sentence += "".join(
+            tokenizer.convert_ids_to_tokens([id], skip_special_tokens=True)
+        )
+print(pt_outputs_sentence)
+# tf
+tf_outputs = tf_model(**tf_inputs, training=False).logits[0]
+tf_outputs_sentence = "tf: "
+for i, id in enumerate(tokenizer.encode(text)):
+    if id == tokenizer.mask_token_id:
+        tokens = tokenizer.convert_ids_to_tokens(tf.math.top_k(tf_outputs[i], k=5)[1])
+        tf_outputs_sentence += "[" + "||".join(tokens) + "]"
+    else:
+        tf_outputs_sentence += "".join(
+            tokenizer.convert_ids_to_tokens([id], skip_special_tokens=True)
+        )
+print(tf_outputs_sentence)
+# small
+# pytorch: 今天[的||,||是||很||也]很好,我[要||会||是||想||在]去公园玩。
+# tf: 今天[的||,||是||很||也]很好,我[要||会||是||想||在]去公园玩。
+# base
+# pytorch: 今天[我||天||晴||园||玩]很好,我[想||要||会||就||带]去公园玩。
+# tf: 今天[我||天||晴||园||玩]很好,我[想||要||会||就||带]去公园玩。
+# large
+# pytorch: 今天[天||气||我||空||阳]很好,我[又||想||会||就||爱]去公园玩。
+# tf: 今天[天||气||我||空||阳]很好,我[又||想||会||就||爱]去公园玩。
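
The rendering loop in `examples/test_mlm_v2.py` walks the encoded tokens, replaces each `[MASK]` position with its top-5 candidates joined by `||`, and drops special tokens. As a rough illustration of that formatting step alone, here is a model-free sketch in plain Python. The `top_k_tokens` and `render` helpers, the `SPECIAL` set, and the toy scores are all assumptions made for illustration; they are not part of the commit, and the scores are not real model outputs.

```python
import heapq

MASK = "[MASK]"
SPECIAL = {"[CLS]", "[SEP]", "[PAD]"}  # tokens hidden from the rendered sentence


def top_k_tokens(scores, k=5):
    """Return the k highest-scoring tokens, best first."""
    return heapq.nlargest(k, scores, key=scores.get)


def render(tokens, scores_by_position, k=5):
    """Rebuild the sentence, expanding each mask into its top-k candidates."""
    parts = []
    for i, tok in enumerate(tokens):
        if tok == MASK:
            candidates = top_k_tokens(scores_by_position[i], k)
            parts.append("[" + "||".join(candidates) + "]")
        elif tok not in SPECIAL:
            parts.append(tok)
    return "".join(parts)


# Toy input: position 3 is masked; the scores are made up for the example.
tokens = ["[CLS]", "今", "天", "[MASK]", "很", "好", "[SEP]"]
scores = {3: {"天": 2.0, "气": 1.5, "我": 0.5}}
rendered = render(tokens, scores, k=2)
print(rendered)
```

In the real script the candidate lists come from `topk` (PyTorch) or `tf.math.top_k` (TensorFlow) over the logits at each masked position; only the string assembly is mirrored here.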

setup.py

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
     name="roformer",
     package_dir={"": "src"},
     packages=find_packages("src"),
-    version="0.3.1",
+    version="0.4.0",
     license="Apache 2.0",
     description="roformer_pytorch",
     author="Jun Yu",

src/roformer/configuration_roformer.py

Lines changed: 4 additions & 0 deletions
@@ -105,6 +105,8 @@ def __init__(
         pad_token_id=0,
         rotary_value=False,
         use_cache=True,
+        use_bias=True,
+        norm_type="layer_norm",
         **kwargs
     ):
         super().__init__(pad_token_id=pad_token_id, **kwargs)
@@ -124,3 +126,5 @@ def __init__(
         self.layer_norm_eps = layer_norm_eps
         self.rotary_value = rotary_value
         self.use_cache = use_cache
+        self.use_bias = use_bias
+        self.norm_type = norm_type
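
The two new configuration fields default to the existing behaviour (`use_bias=True`, `norm_type="layer_norm"`). The modeling code that reads them is not shown in this diff, so the following is only a sketch of how such switches might be consumed, written in plain Python rather than PyTorch. The `SketchConfig` class, the `normalize` and `linear` helpers, and the `"rms_norm"` value are illustrative assumptions, not names confirmed by the commit; learnable scale and shift parameters are left out for brevity.

```python
import math
from dataclasses import dataclass


@dataclass
class SketchConfig:
    # Defaults mirror the values added to the config's __init__ in this diff.
    use_bias: bool = True
    norm_type: str = "layer_norm"
    layer_norm_eps: float = 1e-12


def normalize(x, config):
    """Normalize one hidden vector (a list of floats) according to norm_type."""
    n = len(x)
    if config.norm_type == "layer_norm":
        mean = sum(x) / n
        var = sum((v - mean) ** 2 for v in x) / n
        return [(v - mean) / math.sqrt(var + config.layer_norm_eps) for v in x]
    if config.norm_type == "rms_norm":  # hypothetical alternative value
        mean_sq = sum(v * v for v in x) / n
        return [v / math.sqrt(mean_sq + config.layer_norm_eps) for v in x]
    raise ValueError(f"unknown norm_type: {config.norm_type!r}")


def linear(x, weight, bias, config):
    """Dense layer on lists; the bias is skipped when use_bias is False."""
    out = [sum(w * v for w, v in zip(row, x)) for row in weight]
    if config.use_bias:
        out = [o + b for o, b in zip(out, bias)]
    return out


default_cfg = SketchConfig()
no_bias_cfg = SketchConfig(use_bias=False, norm_type="rms_norm")

identity = [[1.0, 0.0], [0.0, 1.0]]
print(linear([1.0, 2.0], identity, [0.5, 0.5], default_cfg))
print(linear([1.0, 2.0], identity, [0.5, 0.5], no_bias_cfg))
print(normalize([3.0, 4.0], no_bias_cfg))
```

Keeping the defaults equal to the old behaviour means existing v1 config files, which lack these keys, would load unchanged under this reading.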

src/roformer/convert_roformer_original_tf_checkpoint_to_pytorch.py

Lines changed: 8 additions & 1 deletion
@@ -35,10 +35,17 @@ def convert_tf_checkpoint_to_pytorch(
     # Load weights from tf checkpoint
     load_tf_weights_in_roformer(model, config, tf_checkpoint_path)
 
+    # ignore: do not save roformer.encoder.embed_positions.weight
+    _keys_to_ignore_on_save = ["roformer.encoder.embed_positions.weight"]
+    state_dict = model.state_dict()
+    for ignore_key in _keys_to_ignore_on_save:
+        if ignore_key in state_dict.keys():
+            del state_dict[ignore_key]
+
     # Save pytorch-model
     print(f"Save PyTorch model to {pytorch_dump_path}")
     torch.save(
-        model.state_dict(), pytorch_dump_path, _use_new_zipfile_serialization=False
+        state_dict, pytorch_dump_path, _use_new_zipfile_serialization=False
     )
 
 
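
The conversion hunk above removes `roformer.encoder.embed_positions.weight` from the state dict before saving. The diff does not say why; a plausible reason is that the position table is computed rather than learned and can be rebuilt at load time, but that is an assumption. The filtering step itself can be sketched without PyTorch, using an ordinary ordered mapping as a stand-in for `model.state_dict()`. The `drop_keys` helper and the toy values are hypothetical.

```python
from collections import OrderedDict


def drop_keys(state_dict, keys_to_ignore):
    """Return a copy of state_dict without the listed keys.

    Keys that are not present are silently skipped, matching the
    `if ignore_key in state_dict` guard in the conversion script.
    """
    ignore = set(keys_to_ignore)
    return OrderedDict((k, v) for k, v in state_dict.items() if k not in ignore)


# Toy stand-in for model.state_dict(); real values would be tensors.
state_dict = OrderedDict(
    [
        ("roformer.embeddings.word_embeddings.weight", [0.1, 0.2]),
        ("roformer.encoder.embed_positions.weight", [0.0, 1.0]),
    ]
)
filtered = drop_keys(
    state_dict,
    ["roformer.encoder.embed_positions.weight", "not.a.real.key"],
)
print(list(filtered))
```

Unlike the in-place `del` in the script, this variant builds a new mapping and leaves the input untouched; either approach produces the same saved keys.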
