Can document-convert be a standalone project? #4

Open
lucasjinreal opened this issue May 12, 2024 · 12 comments

Comments

@lucasjinreal

It would be very useful if the entry point were easier to find.

@skyroot

skyroot commented May 14, 2024

Can you run "python multi_thread_process_to_doc.py ./pdf --process-num 8" successfully?

@lucasjinreal
Author

I cannot run it on a Mac M1 CPU.

@lucasjinreal
Author

python multi_thread_process_to_doc.py ./pdf
usage: multi_thread_process_to_doc.py [-h] --recipe RECIPE [--loop] [--rest] [--build] [--config CONFIG]
multi_thread_process_to_doc.py: error: the following arguments are required: --recipe

@lucasjinreal
Author

Why use fastdeploy? It is as big as torch but doesn't actually give much of a speed boost.

@InsaneGe

InsaneGe commented May 14, 2024

I got it running with the following commands on Windows with Anaconda:
conda create -n neo python=3.10
pip install -r requirements.txt
python -m pip install fastdeploy-gpu-python -f https://www.paddlepaddle.org.cn/whl/fastdeploy.html
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
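
After commands like these, it can help to confirm which CUDA version the installed PyTorch build targets before running the converter. The following is a minimal sketch, not part of the repository; the helper name torch_cuda_report is hypothetical, and it only uses torch.__version__, torch.version.cuda, and torch.cuda.is_available().

```python
import importlib.util


def torch_cuda_report():
    """Summarize the installed PyTorch build (hypothetical helper, not from the repo)."""
    # Avoid an ImportError when torch is not installed in this environment.
    if importlib.util.find_spec("torch") is None:
        return {"torch": None, "cuda_build": None, "cuda_available": False}

    import torch

    return {
        "torch": torch.__version__,
        # CUDA version the wheel was built against; None for CPU-only builds.
        "cuda_build": torch.version.cuda,
        # True only if a usable GPU and driver are visible at runtime.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(torch_cuda_report())
```

If cuda_build does not report the 11.8 toolkit that the commands above request, the environment probably picked up a different wheel than intended.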

Unfortunately, I have to say that the results of pdf-parser are very mediocre: worse than calling PPStructure when I tested it on PDFs of Chinese papers.

@lucasjinreal
Author

fastdeploy is the worst library I have ever seen. It forces CUDA 11.8 and has no CUDA 12 support, and building it from source means compiling an enormous amount of code, even more than PyTorch itself.

I gave up on it because my CUDA is 12.1 and I don't want to build a new environment just for this.

@tslyellow

I'm running it on a Linux system and hit an error when following the steps you posted. (The penultimate command did not run because of network issues, but running the last command should have the same effect, right?) Here is a screenshot of the problem. Have you encountered this? If so, can you tell me how to fix it? Thanks for the help!
image

@tslyellow

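Hey, after running your four commands, were you able to run the code directly? Mine still fails: it keeps saying that xyxy24p cannot be imported from the cnstd library. I looked at the source and it really isn't there. How did you solve this?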

1719384151892

@InsaneGe

It may have something to do with the platform. I remember trying it on Ubuntu first; it then worked on Windows with the commands shown above.

@tslyellow

tslyellow commented Jun 26, 2024 via email

@wufenglailai

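@tslyellow
How did you solve it? I'm running on Ubuntu and also get the error that xyxy24p cannot be imported from the cnstd library. Did I install the wrong version of cnstd?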

Traceback (most recent call last):
  File "/mnt/nvme1n1/liuhengyu/miniconda3/envs/map_lhy/lib/python3.10/site-packages/albumentations/check_version.py", line 29, in fetch_version_info
    with opener.open(url, timeout=2) as response:
  File "/mnt/nvme1n1/liuhengyu/miniconda3/envs/map_lhy/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File "/mnt/nvme1n1/liuhengyu/miniconda3/envs/map_lhy/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/mnt/nvme1n1/liuhengyu/miniconda3/envs/map_lhy/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/mnt/nvme1n1/liuhengyu/miniconda3/envs/map_lhy/lib/python3.10/urllib/request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/mnt/nvme1n1/liuhengyu/miniconda3/envs/map_lhy/lib/python3.10/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 101] Network is unreachable>
Traceback (most recent call last):
  File "/mnt/nvme1n1/liuhengyu/workspace/python/MAP-NEO-main/Matrix/document-convert/multi_thread_process_to_doc.py", line 12, in <module>
    from latex.latex_rec import Latex2Text, sort_boxes
  File "/mnt/nvme1n1/liuhengyu/workspace/python/MAP-NEO-main/Matrix/document-convert/latex/latex_rec.py", line 24, in <module>
    from cnstd.yolov7.general import xyxy24p, box_partial_overlap
ImportError: cannot import name 'xyxy24p' from 'cnstd.yolov7.general' (/mnt/nvme1n1/liuhengyu/miniconda3/envs/map_lhy/lib/python3.10/site-packages/cnstd/yolov7/general.py)

@wufenglailai

Installing cnstd==1.2.3 seems to resolve the "cannot import name 'xyxy24p' from 'cnstd.yolov7.general'" error.
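
A quick way to check whether an installed cnstd exposes the names that latex/latex_rec.py imports, without running the whole script, is sketched below. This is only an illustration using the standard library; the helper names check_names and installed_version are hypothetical and not part of cnstd or the repository.

```python
import importlib
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_names(module_name, names):
    """Map each name to True/False depending on whether module_name exposes it."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing, so none of the names can be imported.
        return {name: False for name in names}
    return {name: hasattr(module, name) for name in names}


if __name__ == "__main__":
    # These are the two names imported in the traceback earlier in this thread.
    print("cnstd version:", installed_version("cnstd"))
    print(check_names("cnstd.yolov7.general", ["xyxy24p", "box_partial_overlap"]))
```

If xyxy24p is reported as False, the installed cnstd release does not provide it, which matches the ImportError above.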
