We recommend using the Tool feature of uv to install yadt.
- First, refer to uv installation to install uv and set up the PATH environment variable as prompted.
- Use the following command to install yadt:
uv tool install --python 3.12 yadt
yadt --help
- Use the yadt command. For example:
yadt --bing --files example.pdf
# multiple files
yadt --bing --files example1.pdf --files example2.pdf
We still recommend using uv to manage virtual environments.
- First, refer to uv installation to install uv and set up the PATH environment variable as prompted.
- Use the following commands to clone the project and run yadt:
# clone the project
git clone https://github.com/funstory-ai/yadt
# enter the project directory
cd yadt
# install dependencies and run yadt
uv run yadt --help
- Use the uv run yadt command. For example:
uv run yadt --bing --files example.pdf
# multiple files
uv run yadt --bing --files example.pdf --files example2.pdf
Tip
Using absolute paths for the input files is recommended.
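If yadt is driven from a script, the tip above can be applied automatically. The following is a minimal sketch, not part of yadt itself: `build_yadt_command` is a hypothetical helper, and it uses only the `--bing` and `--files` options shown in the examples above.

```python
from pathlib import Path


def build_yadt_command(files, use_uv_run=False):
    """Build an argument list for yadt with every input file made absolute.

    Hypothetical helper for illustration; only the --bing and --files
    options from the examples above are used.
    """
    # "uv run yadt" is the form used when running from a source checkout.
    argv = ["uv", "run", "yadt"] if use_uv_run else ["yadt"]
    argv.append("--bing")
    for name in files:
        # Path.resolve() turns a relative path into an absolute one.
        argv += ["--files", str(Path(name).resolve())]
    return argv
```

The resulting list can be passed to `subprocess.run(argv, check=True)` from the standard library, which avoids shell quoting problems with paths that contain spaces.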
Many projects and teams are working to make document editing and translation easier, such as:
There are also solutions that address specific parts of the problem, such as:
- layoutreader: the reading order of the text blocks in a PDF
- Surya: the structure of the PDF
This project hopes to promote a standard pipeline and interface to solve this problem.
In fact, there are two main stages in a PDF parser or translator:
- Parsing: this stage extracts the structure of the PDF, such as text blocks, images, and tables.
- Rendering: this stage renders that structure into a new PDF or another format.
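The two stages can be sketched as two functions that share a small data structure. This is only an illustration under stated assumptions: the `Block` and `Page` classes, the raw input format, and the plain-text output are all hypothetical and are not yadt's actual types.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    kind: str        # "text", "image", or "table"
    bbox: tuple      # (x0, y0, x1, y1); assumes y grows downward
    text: str = ""


@dataclass
class Page:
    number: int
    blocks: list = field(default_factory=list)


def parse(raw_pages):
    """Parsing stage: turn raw (kind, bbox, text) tuples into structured pages."""
    pages = []
    for number, raw_blocks in enumerate(raw_pages, start=1):
        blocks = [Block(kind=k, bbox=b, text=t) for k, b, t in raw_blocks]
        pages.append(Page(number=number, blocks=blocks))
    return pages


def render_plain_text(pages):
    """Rendering stage: emit one possible output format from the structure."""
    lines = []
    for page in pages:
        lines.append(f"--- page {page.number} ---")
        # Order blocks top to bottom, then left to right.
        for block in sorted(page.blocks, key=lambda b: (b.bbox[1], b.bbox[0])):
            lines.append(block.text if block.kind == "text" else f"[{block.kind}]")
    return "\n".join(lines)


raw = [[("image", (72, 130, 300, 300), ""), ("text", (72, 100, 300, 120), "Hello")]]
output = render_plain_text(parse(raw))
```

Because the parsed structure keeps each block's kind and bounding box, a different renderer could reuse the same parse result to produce another format.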
A service like mathpix parses the PDF into a structure, possibly in an XML format, and then renders it in a single-column reading order, as layoutreader does. The bad news is that the original structure is lost.
Some people use Adobe PDF Parser because it generates a Word document that keeps the original structure. However, it is somewhat expensive. Moreover, a PDF or Word document is not well suited to reading on mobile devices.
We offer an intermediate representation of the parser's results that can be rendered into a new PDF or another format. The pipeline is also a plugin-based system in which anyone can add their own model, OCR, renderer, etc.
Our first goal for version 1.0 is to complete a translation of PDF Reference, Version 1.7 into the following languages:
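One common way to build such a plugin-based pipeline is a registry that maps a format name to a renderer function. The sketch below assumes a very simple intermediate representation (a list of dicts) and a hypothetical `register_renderer` decorator; it does not reflect yadt's real interfaces.

```python
from typing import Callable, Dict, List

# Hypothetical intermediate representation: one dict per block.
Document = List[dict]

# Registry of renderer plugins, keyed by output format name.
RENDERERS: Dict[str, Callable[[Document], str]] = {}


def register_renderer(name):
    """Decorator that adds a renderer function to the registry."""
    def decorator(func):
        RENDERERS[name] = func
        return func
    return decorator


@register_renderer("markdown")
def render_markdown(doc):
    parts = []
    for block in doc:
        prefix = "# " if block["kind"] == "heading" else ""
        parts.append(prefix + block["text"])
    return "\n\n".join(parts)


@register_renderer("text")
def render_text(doc):
    return "\n".join(block["text"] for block in doc)


def render(doc, fmt):
    """Look up the renderer plugin for fmt and apply it to the document."""
    try:
        renderer = RENDERERS[fmt]
    except KeyError:
        raise ValueError(f"no renderer registered for {fmt!r}") from None
    return renderer(doc)


doc = [{"kind": "heading", "text": "Title"}, {"kind": "paragraph", "text": "Body"}]
```

With this design, a new output format only needs one decorated function; the parsing side and the `render` entry point stay unchanged. The same pattern could apply to models or OCR backends.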
- Simplified Chinese
- Traditional Chinese
- Japanese
- Spanish
The translation should also meet the following requirements:
- layout error less than 1%
- content loss less than 1%
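The two thresholds can be expressed as a simple check. The sketch below assumes that errors are counted per unit (for example per page or per text block); the source does not say how layout errors or content loss are measured, so the counting unit and the function names are hypothetical.

```python
def error_rate(errors, total):
    """Return errors / total as a fraction; total must be positive."""
    if total <= 0:
        raise ValueError("total must be positive")
    return errors / total


def meets_v1_targets(layout_errors, layout_total, lost_units, content_total,
                     threshold=0.01):
    """Check both goals: each rate must be strictly below the threshold (1%)."""
    return (error_rate(layout_errors, layout_total) < threshold
            and error_rate(lost_units, content_total) < threshold)
```

For example, 5 layout errors in 1000 units (0.5%) and 3 lost units in 1000 (0.3%) would pass, while exactly 1% would not, because the requirement is "less than 1%".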
- Parsing errors in the author and reference sections; they get merged into one paragraph after translation.
- Lines are not supported.
- Drop caps are not supported.
This project is not yet ready to accept community contributions. Please be patient. Thank you for your support! Community contributions will be open in the future.
However, the following two types of issue reports are especially welcome at this time: