自定义数据集训练KIE的RE模型的时候，提示ValueError: (InvalidArgument) x dim number should greater than 0, but received value is: 0 [Hint: Expected x_dim > 0, but received x_dim:0 <= 0:0.] (at ../paddle/phi/backends/gpu/gpu_launch_config.h:180) #13632

freezehe · 2024-08-10T06:07:54Z

Search before asking

I have searched the PaddleOCR Docs and found no similar bug report.
I have searched the PaddleOCR Issues and found no similar bug report.
I have searched the PaddleOCR Discussions and found no similar bug report.

Bug

你好，我在训练自定义数据集，先贴一下我一行的标注文件内容：zh_train_0.jpg [{"transcription":"委托方名称","points":[[225,1119],[528,1119],[528,1181],[225,1181]],"id":1,"label":"wtfmc_key","linking":[[1,2]]},{"transcription":"上海蝶叶电线电缆有限公司","points":[[1130,1119],[1732,1119],[1732,1181],[1130,1181]],"id":2,"label":"wtfmc_value","linking":[[1,2]]},{"transcription":"委托方地址","points":[[225,1254],[524,1254],[524,1316],[225,1316]],"id":3,"label":"wtfdz_key","linking":[[3,4]]},{"transcription":"嘉定区银龙路258弄14号12幢3层","points":[[1041,1287],[1839,1287],[1839,1338],[1041,1338]],"id":4,"label":"wtfdz_value","linking":[[3,4]]},{"transcription":"委托单编号","points":[[225,1386],[517,1386],[517,1448],[225,1448]],"id":5,"label":"wtdbh_key","linking":[[5,6]]},{"transcription":"2020-8047","points":[[1311,1422],[1540,1422],[1540,1473],[1311,1473]],"id":6,"label":"wtdbh_value","linking":[[5,6]]},{"transcription":"样品名称","points":[[222,1517],[461,1517],[461,1580],[222,1580]],"id":7,"label":"ypmc_key","linking":[[7,8]]},{"transcription":"电子天平","points":[[1325,1547],[1536,1547],[1536,1612],[1325,1612]],"id":8,"label":"ypmc_value","linking":[[7,8]]},{"transcription":"型号/规格","points":[[222,1649],[476,1649],[476,1711],[222,1711]],"id":9,"label":"xhgg_key","linking":[[9,10]]},{"transcription":"ES461","points":[[1355,1682],[1514,1682],[1514,1737],[1355,1737]],"id":10,"label":"xhgg_value","linking":[[9,10]]},{"transcription":"制造厂","points":[[225,1781],[395,1781],[395,1835],[225,1835]],"id":11,"label":"zzc_key","linking":[[11,12]]},{"transcription":"HC","points":[[1384,1814],[1469,1814],[1469,1872],[1384,1872]],"id":12,"label":"zzc_value","linking":[[11,12]]},{"transcription":"样品编号","points":[[225,1909],[461,1909],[461,1971],[225,1971]],"id":13,"label":"ypbh_key","linking":[[13,14]]},{"transcription":"/","points":[[1404,1950],[1435,1943],[1446,1987],[1416,1995]],"id":14,"label":"ypbh_value","linking":[[13,14]]},{"transcription":"委托日期","points":[[223,2032],[466,2041],[464,2107],[221,2098]],"id":15,"label":"wtrq_key","linking":[[15,16]]},{"transcription":"2020年08月24日","points":[[1204,2077],[1636,2077],[1636,2128],[1204,2128]],"id":16,"label":"wtrq_value","linking":[[15,16]]}]，
class_list_xfun.txt 内容如下：
WTFMC_KEY
WTFMC_VALUE
WTFDZ_KEY
WTFDZ_VALUE
WTDBH_KEY
WTDBH_VALUE
YPMC_KEY
YPMC_VALUE
XHGG_KEY
XHGG_VALUE
ZZC_KEY
ZZC_VALUE
YPBH_KEY
YPBH_VALUE
WTRQ_KEY
WTRQ_VALUE
我修改了/home/aistudio/PaddleOCR/configs/kie/vi_layoutxlm/re_vi_layoutxlm_xfund_zh.yml 这个配置文件

VQAReTokenChunk:
max_seq_len: *max_seq_len
entities_labels: {"WTFMC_KEY": 1, "WTFMC_VALUE": 2, "WTFDZ_KEY":3, "WTFDZ_VALUE":4, "WTDBH_KEY":5,"WTDBH_VALUE": 6, "YPMC_KEY": 7, "YPMC_VALUE": 8, "XHGG_KEY":9, "XHGG_VALUE":10, "ZZC_KEY":11,"ZZC_VALUE": 12, "YPBH_KEY": 13, "YPBH_VALUE": 14, "WTRQ_KEY":15, "WTRQ_VALUE":16}
加了entities_labels 这个属性，报错信息如下：
[2024/08/10 14:00:47] ppocr INFO: During the training process, after the 0th iteration, an evaluation is run every 19 iterations
Traceback (most recent call last):
File "/home/aistudio/PaddleOCR/tools/train.py", line 255, in
main(config, device, logger, vdl_writer, seed)
File "/home/aistudio/PaddleOCR/tools/train.py", line 208, in main
program.train(
File "/home/aistudio/PaddleOCR/tools/program.py", line 342, in train
preds = model(batch)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in call
return self.forward(*inputs, **kwargs)
File "/home/aistudio/PaddleOCR/ppocr/modeling/architectures/base_model.py", line 85, in forward
x = self.backbone(x)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in call
return self.forward(*inputs, **kwargs)
File "/home/aistudio/PaddleOCR/ppocr/modeling/backbones/vqa_layoutlm.py", line 248, in forward
x = self.model(
File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in call
return self.forward(*inputs, **kwargs)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1412, in forward
loss, pred_relations = self.extractor(sequence_output, entities, relations)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in call
return self.forward(*inputs, **kwargs)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1304, in forward
relations, entities = self.build_relation(relations, entities)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1258, in build_relation
positive_relations = paddle.stack([relation_head, relation_tail], axis=1)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/tensor/manipulation.py", line 1842, in stack
return _C_ops.stack(x, axis)
ValueError: (InvalidArgument) x dim number should greater than 0, but received value is: 0
[Hint: Expected x_dim > 0, but received x_dim:0 <= 0:0.] (at ../paddle/phi/backends/gpu/gpu_launch_config.h:180)

请问这个是什么bug？

Environment

我是在百度studio训练的。
aiofiles==23.2.1
aiohttp==3.9.5
aiosignal==1.3.1
aistudio-sdk @ file:///home/aistudio/aistudio_sdk-0.2.4-py3-none-any.whl#sha256=d93411cc8764e465860cbf2f97f787dddd1548595d4776c97ddf0ea787dedd81
albucore==0.0.13
albumentations==1.4.10
altair==4.2.2
annotated-types==0.6.0
anyio==4.3.0
astor==0.8.1
asttokens==2.4.1
async-timeout==4.0.3
attrdict3==2.0.2
attrs==23.2.0
Babel==2.14.0
bce-python-sdk==0.9.6
beautifulsoup4==4.12.3
blinker==1.7.0
cachetools==5.3.3
certifi==2024.2.2
charset-normalizer==3.3.2
click==8.1.7
colorama==0.4.6
coloredlogs==15.0.1
colorlog==6.8.2
comm==0.2.2
contourpy==1.2.1
cycler==0.12.1
Cython==3.0.11
datasets==2.19.0
debugpy==1.8.1
decorator==5.1.1
dill==0.3.4
easydict==1.13
entrypoints==0.4
exceptiongroup==1.2.1
executing==2.0.1
fastapi==0.110.2
ffmpy==0.3.2
filelock==3.13.4
fire==0.6.0
Flask==3.0.3
Flask-Babel==2.0.0
flatbuffers==24.3.25
fonttools==4.51.0
frozenlist==1.4.1
fsspec==2024.3.1
future==1.0.0
gitdb==4.0.11
GitPython==3.1.43
gradio==3.40.0
gradio_client==0.15.1
gunicorn==22.0.0
h11==0.14.0
httpcore==1.0.5
httpx==0.27.0
huggingface-hub==0.22.2
humanfriendly==10.0
idna==3.7
imageio==2.34.2
imgaug==0.4.0
importlib_metadata==7.1.0
importlib_resources==6.4.0
ipykernel==6.29.4
ipython==8.23.0
itsdangerous==2.2.0
jedi==0.19.1
jieba==0.42.1
Jinja2==3.1.3
joblib==1.4.0
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
jupyter_client==8.6.1
jupyter_core==5.7.2
kiwisolver==1.4.5
lazy_loader==0.4
linkify-it-py==2.0.3
lmdb==1.5.1
lxml==5.2.2
markdown-it-py==2.2.0
MarkupSafe==2.1.5
matplotlib==3.8.4
matplotlib-inline==0.1.7
mdit-py-plugins==0.3.3
mdurl==0.1.1
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.12.2
nest-asyncio==1.6.0
networkx==3.3
numpy==1.26.4
onnx==1.16.0
onnxruntime==1.17.3
opencv-contrib-python==4.10.0.84
opencv-python==4.9.0.80
opencv-python-headless==4.10.0.84
opt-einsum==3.3.0
orjson==3.10.1
packaging==24.0
paddle2onnx==1.2.1
paddlefsl==1.1.0
paddlehub==2.4.0
paddlenlp==2.5.2
paddleocr==2.8.1
paddlepaddle-gpu @ file:///tmp/paddlepaddle_gpu-2.5.2-cp310-cp310-linux_x86_64.whl#sha256=2b4a84c853c7c88ddf4984c667bfcb824cc8a28a674448099452f50c686cc1bb
pandas==2.2.2
parso==0.8.4
pexpect==4.9.0
pickleshare==0.7.5
pillow==10.3.0
platformdirs==4.2.0
prettytable==3.10.0
prompt-toolkit==3.0.43
protobuf==3.20.3
psutil==5.9.8
ptyprocess==0.7.0
pure-eval==0.2.2
pyarrow==16.0.0
pyarrow-hotfix==0.6
pybind11==2.12.0
pyclipper==1.3.0.post5
pycryptodome==3.20.0
pydantic==2.7.0
pydantic_core==2.18.1
pydeck==0.9.1
pydub==0.25.1
Pygments==2.17.2
Pympler==1.0.1
pypandoc==1.13
pyparsing==3.1.2
python-dateutil==2.9.0.post0
python-docx==1.1.2
python-multipart==0.0.9
pytz==2024.1
PyYAML==6.0.1
pyzmq==26.0.2
rapidfuzz==3.9.6
rarfile==4.2
referencing==0.34.0
requests==2.31.0
rich==13.7.1
rpds-py==0.18.0
ruff==0.4.1
safetensors==0.4.3
scikit-image==0.24.0
scikit-learn==1.4.2
scipy==1.13.0
semantic-version==2.10.0
semver==3.0.2
sentencepiece==0.2.0
seqeval==1.2.2
shapely==2.0.5
shellingham==1.5.4
six==1.16.0
smmap==5.0.1
sniffio==1.3.1
soupsieve==2.5
stack-data==0.6.3
starlette==0.37.2
streamlit==1.13.0
streamlit-image-comparison==0.0.4
sympy==1.12
termcolor==2.4.0
threadpoolctl==3.4.0
tifffile==2024.7.24
toml==0.10.2
tomli==2.0.1
tomlkit==0.12.0
tool-helpers==0.1.1
toolz==0.12.1
tornado==6.4
tqdm==4.66.2
traitlets==5.14.3
typer==0.12.3
typing_extensions==4.11.0
tzdata==2024.1
tzlocal==5.2
uc-micro-py==1.0.3
urllib3==2.2.1
uvicorn==0.29.0
validators==0.28.3
visualdl==2.4.2
watchdog==4.0.1
wcwidth==0.2.13
websockets==11.0.3
Werkzeug==3.0.2
xxhash==3.4.1
yacs==0.1.8
yarl==1.9.4
zipp==3.19.2

Minimal Reproducible Example

re的配置文件
Global:
use_gpu: True
epoch_num: &epoch_num 130
log_smooth_window: 10
print_batch_step: 10
save_model_dir: ./output/ccic/re_vi_layoutxlm_xfund_zh
save_epoch_step: 2000

evaluation is run every 10 iterations after the 0th iteration

eval_batch_step: [ 0, 19 ]
cal_metric_during_train: False
save_inference_dir:
use_visualdl: False
seed: 2022
infer_img: ppstructure/docs/kie/input/zh_val_21.jpg
save_res_path: ./output/ccic/re/xfund_zh/with_gt
kie_rec_model_dir:
kie_det_model_dir:

Architecture:
model_type: kie
algorithm: &algorithm "LayoutXLM"
Transform:
Backbone:
name: LayoutXLMForRe
pretrained: True
mode: vi
checkpoints:

Loss:
name: LossFromOutput
key: loss
reduction: mean

Optimizer:
name: AdamW
beta1: 0.9
beta2: 0.999
clip_norm: 10
lr:
learning_rate: 0.00005
warmup_epoch: 10
regularizer:
name: L2
factor: 0.00000

PostProcess:
name: VQAReTokenLayoutLMPostProcess

Metric:
name: VQAReTokenMetric
main_indicator: hmean

Train:
dataset:
name: SimpleDataSet
data_dir: train_data/0810_8020/zh_train/image
label_file_list:
- train_data/0810_8020/zh_train/train.json
ratio_list: [ 1.0 ]
transforms:
- DecodeImage: # load image
img_mode: RGB
channel_first: False
- VQATokenLabelEncode: # Class handling label
contains_re: True
algorithm: *algorithm
class_path: &class_path /home/aistudio/PaddleOCR/train_data/0810_8020/class_list_xfun.txt
#class_path: /home/aistudio/PaddleOCR/train_data/0810_8020/class_list_xfun.txt
use_textline_bbox_info: &use_textline_bbox_info True
order_method: &order_method "tb-yx"
- VQATokenPad:
max_seq_len: &max_seq_len 512
return_attention_mask: True
- VQAReTokenRelation:
- VQAReTokenChunk:
max_seq_len: *max_seq_len
entities_labels: {"WTFMC_KEY": 1, "WTFMC_VALUE": 2, "WTFDZ_KEY":3, "WTFDZ_VALUE":4, "WTDBH_KEY":5,"WTDBH_VALUE": 6, "YPMC_KEY": 7, "YPMC_VALUE": 8, "XHGG_KEY":9, "XHGG_VALUE":10, "ZZC_KEY":11,"ZZC_VALUE": 12, "YPBH_KEY": 13, "YPBH_VALUE": 14, "WTRQ_KEY":15, "WTRQ_VALUE":16}
- TensorizeEntitiesRelations:
- Resize:
size: [224,224]
- NormalizeImage:
scale: 1
mean: [ 123.675, 116.28, 103.53 ]
std: [ 58.395, 57.12, 57.375 ]
order: 'hwc'
- ToCHWImage:
- KeepKeys:
keep_keys: [ 'input_ids', 'bbox','attention_mask', 'token_type_ids', 'entities', 'relations'] # dataloader will return list in this order
loader:
shuffle: True
drop_last: False
batch_size_per_card: 2
num_workers: 4

Eval:
dataset:
name: SimpleDataSet
data_dir: train_data/0810_8020/zh_val/image
label_file_list:
- train_data/0810_8020/zh_val/val.json
transforms:
- DecodeImage: # load image
img_mode: RGB
channel_first: False
- VQATokenLabelEncode: # Class handling label
contains_re: True
algorithm: *algorithm
class_path: *class_path
use_textline_bbox_info: *use_textline_bbox_info
order_method: *order_method
- VQATokenPad:
max_seq_len: *max_seq_len
return_attention_mask: True
- VQAReTokenRelation:
- VQAReTokenChunk:
max_seq_len: *max_seq_len
- TensorizeEntitiesRelations:
- Resize:
size: [224,224]
- NormalizeImage:
scale: 1
mean: [ 123.675, 116.28, 103.53 ]
std: [ 58.395, 57.12, 57.375 ]
order: 'hwc'
- ToCHWImage:
- KeepKeys:
keep_keys: [ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'entities', 'relations'] # dataloader will return list in this order
loader:
shuffle: False
drop_last: False
batch_size_per_card: 8
num_workers: 8

Additional

No response

Are you willing to submit a PR?

Yes I'd like to help by submitting a PR!

freezehe added the bug Something isn't working label Aug 10, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

自定义数据集训练KIE的RE模型的时候，提示ValueError: (InvalidArgument) x dim number should greater than 0, but received value is: 0 [Hint: Expected x_dim > 0, but received x_dim:0 <= 0:0.] (at ../paddle/phi/backends/gpu/gpu_launch_config.h:180) #13632

自定义数据集训练KIE的RE模型的时候，提示ValueError: (InvalidArgument) x dim number should greater than 0, but received value is: 0 [Hint: Expected x_dim > 0, but received x_dim:0 <= 0:0.] (at ../paddle/phi/backends/gpu/gpu_launch_config.h:180) #13632

freezehe commented Aug 10, 2024 •

edited

Loading

自定义数据集训练KIE的RE模型的时候，提示ValueError: (InvalidArgument) x dim number should greater than 0, but received value is: 0 [Hint: Expected x_dim > 0, but received x_dim:0 <= 0:0.] (at ../paddle/phi/backends/gpu/gpu_launch_config.h:180) #13632

自定义数据集训练KIE的RE模型的时候，提示ValueError: (InvalidArgument) x dim number should greater than 0, but received value is: 0 [Hint: Expected x_dim > 0, but received x_dim:0 <= 0:0.] (at ../paddle/phi/backends/gpu/gpu_launch_config.h:180) #13632

Comments

freezehe commented Aug 10, 2024 • edited Loading

Search before asking

Bug

Environment

Minimal Reproducible Example

evaluation is run every 10 iterations after the 0th iteration

Additional

Are you willing to submit a PR?

freezehe commented Aug 10, 2024 •

edited

Loading