Training the KIE RE model on a custom dataset fails with ValueError: (InvalidArgument) x dim number should greater than 0, but received value is: 0 [Hint: Expected x_dim > 0, but received x_dim:0 <= 0:0.] (at ../paddle/phi/backends/gpu/gpu_launch_config.h:180) #13632

Open
3 of 4 tasks
freezehe opened this issue Aug 10, 2024 · 0 comments
Labels
bug Something isn't working

freezehe commented Aug 10, 2024

Search before asking

  • I have searched the PaddleOCR Docs and found no similar bug report.

  • I have searched the PaddleOCR Issues and found no similar bug report.

  • I have searched the PaddleOCR Discussions and found no similar bug report.

Bug

Hello, I am training on a custom dataset. Here is one line from my annotation file:
zh_train_0.jpg [{"transcription":"委托方名称","points":[[225,1119],[528,1119],[528,1181],[225,1181]],"id":1,"label":"wtfmc_key","linking":[[1,2]]},{"transcription":"上海蝶叶电线电缆有限公司","points":[[1130,1119],[1732,1119],[1732,1181],[1130,1181]],"id":2,"label":"wtfmc_value","linking":[[1,2]]},{"transcription":"委托方地址","points":[[225,1254],[524,1254],[524,1316],[225,1316]],"id":3,"label":"wtfdz_key","linking":[[3,4]]},{"transcription":"嘉定区银龙路258弄14号12幢3层","points":[[1041,1287],[1839,1287],[1839,1338],[1041,1338]],"id":4,"label":"wtfdz_value","linking":[[3,4]]},{"transcription":"委托单编号","points":[[225,1386],[517,1386],[517,1448],[225,1448]],"id":5,"label":"wtdbh_key","linking":[[5,6]]},{"transcription":"2020-8047","points":[[1311,1422],[1540,1422],[1540,1473],[1311,1473]],"id":6,"label":"wtdbh_value","linking":[[5,6]]},{"transcription":"样品名称","points":[[222,1517],[461,1517],[461,1580],[222,1580]],"id":7,"label":"ypmc_key","linking":[[7,8]]},{"transcription":"电子天平","points":[[1325,1547],[1536,1547],[1536,1612],[1325,1612]],"id":8,"label":"ypmc_value","linking":[[7,8]]},{"transcription":"型号/规格","points":[[222,1649],[476,1649],[476,1711],[222,1711]],"id":9,"label":"xhgg_key","linking":[[9,10]]},{"transcription":"ES461","points":[[1355,1682],[1514,1682],[1514,1737],[1355,1737]],"id":10,"label":"xhgg_value","linking":[[9,10]]},{"transcription":"制造厂","points":[[225,1781],[395,1781],[395,1835],[225,1835]],"id":11,"label":"zzc_key","linking":[[11,12]]},{"transcription":"HC","points":[[1384,1814],[1469,1814],[1469,1872],[1384,1872]],"id":12,"label":"zzc_value","linking":[[11,12]]},{"transcription":"样品编号","points":[[225,1909],[461,1909],[461,1971],[225,1971]],"id":13,"label":"ypbh_key","linking":[[13,14]]},{"transcription":"/","points":[[1404,1950],[1435,1943],[1446,1987],[1416,1995]],"id":14,"label":"ypbh_value","linking":[[13,14]]},{"transcription":"委托日期","points":[[223,2032],[466,2041],[464,2107],[221,2098]],"id":15,"label":"wtrq_key","linking":[[15,16]]},{"transcription":"2020年08月24日","points":[[1204,2077],[1636,2077],[1636,2128],[1204,2128]],"id":16,"label":"wtrq_value","linking":[[15,16]]}]
The contents of class_list_xfun.txt are as follows:
WTFMC_KEY
WTFMC_VALUE
WTFDZ_KEY
WTFDZ_VALUE
WTDBH_KEY
WTDBH_VALUE
YPMC_KEY
YPMC_VALUE
XHGG_KEY
XHGG_VALUE
ZZC_KEY
ZZC_VALUE
YPBH_KEY
YPBH_VALUE
WTRQ_KEY
WTRQ_VALUE
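
For anyone trying to reproduce this, here is a rough sketch of how the annotation line and the class list could be cross-checked before training. It numbers the class names from 1 in file order (the same numbering as the entities_labels mapping in this issue), then checks that every label in an annotation line is in the class list and that every linking pair points at an id that exists. Several things here are my assumptions, not confirmed PaddleOCR behaviour: the helper names are made up, the image name and JSON are assumed to be separated by a tab (with a fallback to the first space), and labels are compared after upper-casing because the annotation uses lowercase while the class list is uppercase. The sample line is shortened to the first two entries.

```python
import json

# Class names copied from class_list_xfun.txt in this issue.
CLASS_LIST = """WTFMC_KEY
WTFMC_VALUE
WTFDZ_KEY
WTFDZ_VALUE
WTDBH_KEY
WTDBH_VALUE
YPMC_KEY
YPMC_VALUE
XHGG_KEY
XHGG_VALUE
ZZC_KEY
ZZC_VALUE
YPBH_KEY
YPBH_VALUE
WTRQ_KEY
WTRQ_VALUE"""

# First two entries of the annotation line from this issue (shortened).
SAMPLE_LINE = (
    'zh_train_0.jpg\t[{"transcription":"委托方名称","points":[[225,1119],[528,1119],'
    '[528,1181],[225,1181]],"id":1,"label":"wtfmc_key","linking":[[1,2]]},'
    '{"transcription":"上海蝶叶电线电缆有限公司","points":[[1130,1119],[1732,1119],'
    '[1732,1181],[1130,1181]],"id":2,"label":"wtfmc_value","linking":[[1,2]]}]'
)


def build_label_map(class_text):
    """Number class names from 1 in file order (hypothetical helper)."""
    names = [line.strip() for line in class_text.splitlines() if line.strip()]
    return {name: index for index, name in enumerate(names, start=1)}


def check_annotation_line(line, label_map):
    """Return a list of problems found in one annotation line."""
    name, sep, payload = line.partition("\t")
    if not sep:  # assumption: fall back to the first space when no tab is present
        name, sep, payload = line.partition(" ")
    entries = json.loads(payload)
    ids = {entry["id"] for entry in entries}
    problems = []
    for entry in entries:
        # Assumption: labels are matched case-insensitively.
        if entry["label"].upper() not in label_map:
            problems.append(f"{name}: unknown label {entry['label']!r} (id {entry['id']})")
        for head, tail in entry.get("linking", []):
            if head not in ids or tail not in ids:
                problems.append(f"{name}: linking [{head}, {tail}] references a missing id")
    return problems


label_map = build_label_map(CLASS_LIST)
print(check_annotation_line(SAMPLE_LINE, label_map))
```

An empty list means the line passed both checks. It says nothing about how the training pipeline itself interprets the labels.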
I modified the configuration file /home/aistudio/PaddleOCR/configs/kie/vi_layoutxlm/re_vi_layoutxlm_xfund_zh.yml:

  • VQAReTokenChunk:
    max_seq_len: *max_seq_len
    entities_labels: {"WTFMC_KEY": 1, "WTFMC_VALUE": 2, "WTFDZ_KEY":3, "WTFDZ_VALUE":4, "WTDBH_KEY":5,"WTDBH_VALUE": 6, "YPMC_KEY": 7, "YPMC_VALUE": 8, "XHGG_KEY":9, "XHGG_VALUE":10, "ZZC_KEY":11,"ZZC_VALUE": 12, "YPBH_KEY": 13, "YPBH_VALUE": 14, "WTRQ_KEY":15, "WTRQ_VALUE":16}
    I added the entities_labels attribute, and training now fails with the following error:
    [2024/08/10 14:00:47] ppocr INFO: During the training process, after the 0th iteration, an evaluation is run every 19 iterations
    Traceback (most recent call last):
    File "/home/aistudio/PaddleOCR/tools/train.py", line 255, in <module>
    main(config, device, logger, vdl_writer, seed)
    File "/home/aistudio/PaddleOCR/tools/train.py", line 208, in main
    program.train(
    File "/home/aistudio/PaddleOCR/tools/program.py", line 342, in train
    preds = model(batch)
    File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in __call__
    return self.forward(*inputs, **kwargs)
    File "/home/aistudio/PaddleOCR/ppocr/modeling/architectures/base_model.py", line 85, in forward
    x = self.backbone(x)
    File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in __call__
    return self.forward(*inputs, **kwargs)
    File "/home/aistudio/PaddleOCR/ppocr/modeling/backbones/vqa_layoutlm.py", line 248, in forward
    x = self.model(
    File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in __call__
    return self.forward(*inputs, **kwargs)
    File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1412, in forward
    loss, pred_relations = self.extractor(sequence_output, entities, relations)
    File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in __call__
    return self.forward(*inputs, **kwargs)
    File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1304, in forward
    relations, entities = self.build_relation(relations, entities)
    File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1258, in build_relation
    positive_relations = paddle.stack([relation_head, relation_tail], axis=1)
    File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/tensor/manipulation.py", line 1842, in stack
    return _C_ops.stack(x, axis)
    ValueError: (InvalidArgument) x dim number should greater than 0, but received value is: 0
    [Hint: Expected x_dim > 0, but received x_dim:0 <= 0:0.] (at ../paddle/phi/backends/gpu/gpu_launch_config.h:180)

What is causing this error?
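
The traceback ends at paddle.stack([relation_head, relation_tail], axis=1) inside build_relation, so one thing that may be worth ruling out is a training sample that carries no linking pairs at all. That is only a guess from the traceback, not a confirmed diagnosis. A minimal sketch of such a scan is below. The helper names and the two demo lines are made up, and the tab separator (with a fallback to the first space) is an assumption about the label file format.

```python
import json


def split_line(line):
    """Split 'image<sep>json' on the first tab, or the first space if no tab."""
    name, sep, payload = line.partition("\t")
    if not sep:
        name, sep, payload = line.partition(" ")
    return name, payload


def relation_pairs(line):
    """Collect the distinct (head, tail) id pairs listed under 'linking'."""
    _, payload = split_line(line)
    entries = json.loads(payload)
    return {tuple(pair) for entry in entries for pair in entry.get("linking", [])}


def images_without_relations(lines):
    """Return the image names whose annotation carries no linking pair."""
    empty = []
    for raw in lines:
        line = raw.strip()
        if line and not relation_pairs(line):
            empty.append(split_line(line)[0])
    return empty


# Two made-up lines for illustration: the second has no linking pairs.
DEMO_LINES = [
    'a.jpg\t[{"transcription":"k","points":[[0,0],[1,0],[1,1],[0,1]],"id":1,'
    '"label":"wtfmc_key","linking":[[1,2]]},'
    '{"transcription":"v","points":[[2,0],[3,0],[3,1],[2,1]],"id":2,'
    '"label":"wtfmc_value","linking":[[1,2]]}]',
    'b.jpg\t[{"transcription":"k","points":[[0,0],[1,0],[1,1],[0,1]],"id":1,'
    '"label":"wtfmc_key","linking":[]}]',
]

print(images_without_relations(DEMO_LINES))
```

To run it on real data, pass an open file handle for the label file (for example train_data/0810_8020/zh_train/train.json, opened with encoding="utf-8") in place of DEMO_LINES.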

Environment

I am training on Baidu AI Studio.
aiofiles==23.2.1
aiohttp==3.9.5
aiosignal==1.3.1
aistudio-sdk @ file:///home/aistudio/aistudio_sdk-0.2.4-py3-none-any.whl#sha256=d93411cc8764e465860cbf2f97f787dddd1548595d4776c97ddf0ea787dedd81
albucore==0.0.13
albumentations==1.4.10
altair==4.2.2
annotated-types==0.6.0
anyio==4.3.0
astor==0.8.1
asttokens==2.4.1
async-timeout==4.0.3
attrdict3==2.0.2
attrs==23.2.0
Babel==2.14.0
bce-python-sdk==0.9.6
beautifulsoup4==4.12.3
blinker==1.7.0
cachetools==5.3.3
certifi==2024.2.2
charset-normalizer==3.3.2
click==8.1.7
colorama==0.4.6
coloredlogs==15.0.1
colorlog==6.8.2
comm==0.2.2
contourpy==1.2.1
cycler==0.12.1
Cython==3.0.11
datasets==2.19.0
debugpy==1.8.1
decorator==5.1.1
dill==0.3.4
easydict==1.13
entrypoints==0.4
exceptiongroup==1.2.1
executing==2.0.1
fastapi==0.110.2
ffmpy==0.3.2
filelock==3.13.4
fire==0.6.0
Flask==3.0.3
Flask-Babel==2.0.0
flatbuffers==24.3.25
fonttools==4.51.0
frozenlist==1.4.1
fsspec==2024.3.1
future==1.0.0
gitdb==4.0.11
GitPython==3.1.43
gradio==3.40.0
gradio_client==0.15.1
gunicorn==22.0.0
h11==0.14.0
httpcore==1.0.5
httpx==0.27.0
huggingface-hub==0.22.2
humanfriendly==10.0
idna==3.7
imageio==2.34.2
imgaug==0.4.0
importlib_metadata==7.1.0
importlib_resources==6.4.0
ipykernel==6.29.4
ipython==8.23.0
itsdangerous==2.2.0
jedi==0.19.1
jieba==0.42.1
Jinja2==3.1.3
joblib==1.4.0
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
jupyter_client==8.6.1
jupyter_core==5.7.2
kiwisolver==1.4.5
lazy_loader==0.4
linkify-it-py==2.0.3
lmdb==1.5.1
lxml==5.2.2
markdown-it-py==2.2.0
MarkupSafe==2.1.5
matplotlib==3.8.4
matplotlib-inline==0.1.7
mdit-py-plugins==0.3.3
mdurl==0.1.1
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.12.2
nest-asyncio==1.6.0
networkx==3.3
numpy==1.26.4
onnx==1.16.0
onnxruntime==1.17.3
opencv-contrib-python==4.10.0.84
opencv-python==4.9.0.80
opencv-python-headless==4.10.0.84
opt-einsum==3.3.0
orjson==3.10.1
packaging==24.0
paddle2onnx==1.2.1
paddlefsl==1.1.0
paddlehub==2.4.0
paddlenlp==2.5.2
paddleocr==2.8.1
paddlepaddle-gpu @ file:///tmp/paddlepaddle_gpu-2.5.2-cp310-cp310-linux_x86_64.whl#sha256=2b4a84c853c7c88ddf4984c667bfcb824cc8a28a674448099452f50c686cc1bb
pandas==2.2.2
parso==0.8.4
pexpect==4.9.0
pickleshare==0.7.5
pillow==10.3.0
platformdirs==4.2.0
prettytable==3.10.0
prompt-toolkit==3.0.43
protobuf==3.20.3
psutil==5.9.8
ptyprocess==0.7.0
pure-eval==0.2.2
pyarrow==16.0.0
pyarrow-hotfix==0.6
pybind11==2.12.0
pyclipper==1.3.0.post5
pycryptodome==3.20.0
pydantic==2.7.0
pydantic_core==2.18.1
pydeck==0.9.1
pydub==0.25.1
Pygments==2.17.2
Pympler==1.0.1
pypandoc==1.13
pyparsing==3.1.2
python-dateutil==2.9.0.post0
python-docx==1.1.2
python-multipart==0.0.9
pytz==2024.1
PyYAML==6.0.1
pyzmq==26.0.2
rapidfuzz==3.9.6
rarfile==4.2
referencing==0.34.0
requests==2.31.0
rich==13.7.1
rpds-py==0.18.0
ruff==0.4.1
safetensors==0.4.3
scikit-image==0.24.0
scikit-learn==1.4.2
scipy==1.13.0
semantic-version==2.10.0
semver==3.0.2
sentencepiece==0.2.0
seqeval==1.2.2
shapely==2.0.5
shellingham==1.5.4
six==1.16.0
smmap==5.0.1
sniffio==1.3.1
soupsieve==2.5
stack-data==0.6.3
starlette==0.37.2
streamlit==1.13.0
streamlit-image-comparison==0.0.4
sympy==1.12
termcolor==2.4.0
threadpoolctl==3.4.0
tifffile==2024.7.24
toml==0.10.2
tomli==2.0.1
tomlkit==0.12.0
tool-helpers==0.1.1
toolz==0.12.1
tornado==6.4
tqdm==4.66.2
traitlets==5.14.3
typer==0.12.3
typing_extensions==4.11.0
tzdata==2024.1
tzlocal==5.2
uc-micro-py==1.0.3
urllib3==2.2.1
uvicorn==0.29.0
validators==0.28.3
visualdl==2.4.2
watchdog==4.0.1
wcwidth==0.2.13
websockets==11.0.3
Werkzeug==3.0.2
xxhash==3.4.1
yacs==0.1.8
yarl==1.9.4
zipp==3.19.2

Minimal Reproducible Example

The RE configuration file:
Global:
  use_gpu: True
  epoch_num: &epoch_num 130
  log_smooth_window: 10
  print_batch_step: 10
  save_model_dir: ./output/ccic/re_vi_layoutxlm_xfund_zh
  save_epoch_step: 2000
  # evaluation is run every 10 iterations after the 0th iteration
  eval_batch_step: [ 0, 19 ]
  cal_metric_during_train: False
  save_inference_dir:
  use_visualdl: False
  seed: 2022
  infer_img: ppstructure/docs/kie/input/zh_val_21.jpg
  save_res_path: ./output/ccic/re/xfund_zh/with_gt
  kie_rec_model_dir:
  kie_det_model_dir:

Architecture:
  model_type: kie
  algorithm: &algorithm "LayoutXLM"
  Transform:
  Backbone:
    name: LayoutXLMForRe
    pretrained: True
    mode: vi
    checkpoints:

Loss:
  name: LossFromOutput
  key: loss
  reduction: mean

Optimizer:
  name: AdamW
  beta1: 0.9
  beta2: 0.999
  clip_norm: 10
  lr:
    learning_rate: 0.00005
    warmup_epoch: 10
  regularizer:
    name: L2
    factor: 0.00000

PostProcess:
  name: VQAReTokenLayoutLMPostProcess

Metric:
  name: VQAReTokenMetric
  main_indicator: hmean

Train:
  dataset:
    name: SimpleDataSet
    data_dir: train_data/0810_8020/zh_train/image
    label_file_list:
      - train_data/0810_8020/zh_train/train.json
    ratio_list: [ 1.0 ]
    transforms:
      - DecodeImage: # load image
          img_mode: RGB
          channel_first: False
      - VQATokenLabelEncode: # Class handling label
          contains_re: True
          algorithm: *algorithm
          class_path: &class_path /home/aistudio/PaddleOCR/train_data/0810_8020/class_list_xfun.txt
          #class_path: /home/aistudio/PaddleOCR/train_data/0810_8020/class_list_xfun.txt
          use_textline_bbox_info: &use_textline_bbox_info True
          order_method: &order_method "tb-yx"
      - VQATokenPad:
          max_seq_len: &max_seq_len 512
          return_attention_mask: True
      - VQAReTokenRelation:
      - VQAReTokenChunk:
          max_seq_len: *max_seq_len
          entities_labels: {"WTFMC_KEY": 1, "WTFMC_VALUE": 2, "WTFDZ_KEY":3, "WTFDZ_VALUE":4, "WTDBH_KEY":5,"WTDBH_VALUE": 6, "YPMC_KEY": 7, "YPMC_VALUE": 8, "XHGG_KEY":9, "XHGG_VALUE":10, "ZZC_KEY":11,"ZZC_VALUE": 12, "YPBH_KEY": 13, "YPBH_VALUE": 14, "WTRQ_KEY":15, "WTRQ_VALUE":16}
      - TensorizeEntitiesRelations:
      - Resize:
          size: [224,224]
      - NormalizeImage:
          scale: 1
          mean: [ 123.675, 116.28, 103.53 ]
          std: [ 58.395, 57.12, 57.375 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'input_ids', 'bbox','attention_mask', 'token_type_ids', 'entities', 'relations'] # dataloader will return list in this order
  loader:
    shuffle: True
    drop_last: False
    batch_size_per_card: 2
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: train_data/0810_8020/zh_val/image
    label_file_list:
      - train_data/0810_8020/zh_val/val.json
    transforms:
      - DecodeImage: # load image
          img_mode: RGB
          channel_first: False
      - VQATokenLabelEncode: # Class handling label
          contains_re: True
          algorithm: *algorithm
          class_path: *class_path
          use_textline_bbox_info: *use_textline_bbox_info
          order_method: *order_method
      - VQATokenPad:
          max_seq_len: *max_seq_len
          return_attention_mask: True
      - VQAReTokenRelation:
      - VQAReTokenChunk:
          max_seq_len: *max_seq_len
      - TensorizeEntitiesRelations:
      - Resize:
          size: [224,224]
      - NormalizeImage:
          scale: 1
          mean: [ 123.675, 116.28, 103.53 ]
          std: [ 58.395, 57.12, 57.375 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'entities', 'relations'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 8
    num_workers: 8
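
One detail in the configuration above: the Train VQAReTokenChunk entry sets entities_labels, while the Eval entry sets only max_seq_len. Whether that difference has anything to do with the error is not established here. The sketch below just makes such Train/Eval differences easy to spot. The function name is hypothetical, the two dictionaries are hand-copied from the config with the YAML anchors resolved to 512, and entities_labels is abbreviated to two entries.

```python
def diff_transform_args(train_args, eval_args):
    """Compare the keyword arguments of one transform across Train and Eval."""
    train_args = train_args or {}  # a bare "- Name:" entry has no arguments
    eval_args = eval_args or {}
    shared = set(train_args) & set(eval_args)
    return {
        "only_in_train": sorted(set(train_args) - set(eval_args)),
        "only_in_eval": sorted(set(eval_args) - set(train_args)),
        "different": sorted(key for key in shared if train_args[key] != eval_args[key]),
    }


# Hand-copied from the config in this issue; entities_labels is abbreviated.
TRAIN_CHUNK = {
    "max_seq_len": 512,
    "entities_labels": {"WTFMC_KEY": 1, "WTFMC_VALUE": 2},
}
EVAL_CHUNK = {"max_seq_len": 512}

report = diff_transform_args(TRAIN_CHUNK, EVAL_CHUNK)
print(report)
```

The same function can be applied to any pair of transform entries once the YAML has been loaded into dictionaries.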

Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!
@freezehe freezehe added the bug Something isn't working label Aug 10, 2024