[skip ci] Documentation updates
felixdittrich92 committed Feb 28, 2024
1 parent 306fe1b commit af81e06
Showing 33 changed files with 412 additions and 12 deletions.
Binary file modified .doctrees/environment.pickle
Binary file not shown.
21 changes: 21 additions & 0 deletions latest/_sources/using_doctr/using_models.rst.txt
@@ -279,6 +279,19 @@ For instance, this snippet instantiates an end-to-end ocr_predictor working with
    from doctr.models import ocr_predictor
    model = ocr_predictor('linknet_resnet18', pretrained=True, assume_straight_pages=False, preserve_aspect_ratio=True)

To modify the output structure, you can pass the following arguments to the predictor; they are handled by the underlying `DocumentBuilder`:

* `resolve_lines`: whether words should be automatically grouped into lines (default: True)
* `resolve_blocks`: whether lines should be automatically grouped into blocks (default: True)
* `paragraph_break`: relative length of the minimum space separating paragraphs (default: 0.035)

For example, to disable the automatic grouping of lines into blocks:

.. code:: python3

    from doctr.models import ocr_predictor
    model = ocr_predictor(pretrained=True, resolve_blocks=False)

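
As a rough illustration of what a relative threshold such as `paragraph_break` means, the sketch below splits one row of words wherever the horizontal gap between neighbours reaches the threshold. It is a simplified stand-in, not the `DocumentBuilder` implementation: the `WordBox` tuple, the `split_on_gaps` helper, and the choice to measure the gap horizontally in page-relative coordinates are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

# Hypothetical word box: (xmin, xmax, text), with x-coordinates
# expressed relative to the page width (values between 0 and 1).
WordBox = Tuple[float, float, str]


def split_on_gaps(words: List[WordBox], paragraph_break: float = 0.035) -> List[List[str]]:
    """Split one row of words wherever the horizontal gap reaches the threshold."""
    groups: List[List[str]] = []
    prev_xmax: Optional[float] = None
    # Process words from left to right (tuples sort by xmin first).
    for xmin, xmax, text in sorted(words):
        # Open a new group for the first word or when the gap is too wide.
        if prev_xmax is None or xmin - prev_xmax >= paragraph_break:
            groups.append([])
        groups[-1].append(text)
        prev_xmax = xmax
    return groups


# Illustrative coordinates only: a small gap, then a wide one.
row = [(0.10, 0.16, "Item"), (0.17, 0.25, "price"), (0.60, 0.70, "Notes")]
print(split_on_gaps(row))
print(split_on_gaps(row, paragraph_break=0.5))
```

With these made-up coordinates, the default threshold separates the last word from the first two, while a larger threshold keeps the whole row together.
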
What should I do with the output?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -304,6 +317,14 @@ Here is a typical `Document` layout::
)]
)

To get only the text content of the `Document`, you can use the `render` method::

    text_output = result.render()

For reference, here is the output for the `Document` above::

    No. RECEIPT DATE

You can also export the `Document` as a nested dict, which is better suited to JSON serialization::

    json_output = result.export()
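
If you need the text back from the exported dict, a small traversal is usually enough. The sketch below assumes the export nests `pages`, `blocks`, `lines` and `words`, with each word's text under a `value` key; those key names and the `export_to_text` helper are assumptions for illustration, so check them against your own `export()` output.

```python
import json
from typing import Any, Dict


def export_to_text(exported: Dict[str, Any]) -> str:
    """Rebuild plain text from a nested export: words -> lines -> blocks -> pages."""
    pages = []
    for page in exported.get("pages", []):
        blocks = []
        for block in page.get("blocks", []):
            # Words on a line are joined with spaces, lines with newlines.
            lines = [
                " ".join(word["value"] for word in line.get("words", []))
                for line in block.get("lines", [])
            ]
            blocks.append("\n".join(lines))
        # Blocks (and pages) are separated by a blank line.
        pages.append("\n\n".join(blocks))
    return "\n\n".join(pages)


# Hand-built stand-in for an exported document (assumed layout).
sample = {
    "pages": [
        {"blocks": [{"lines": [{"words": [
            {"value": "No."}, {"value": "RECEIPT"}, {"value": "DATE"},
        ]}]}]}
    ]
}

print(export_to_text(sample))
# The nested dict contains only basic types, so it serializes directly.
print(json.dumps(sample))
```
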
2 changes: 1 addition & 1 deletion latest/searchindex.js

Large diffs are not rendered by default.

19 changes: 19 additions & 0 deletions latest/using_doctr/using_models.html
@@ -836,6 +836,17 @@ <h3>Two-stage approaches<a class="headerlink" href="#two-stage-approaches" title
<span class="n">model</span> <span class="o">=</span> <span class="n">ocr_predictor</span><span class="p">(</span><span class="s1">&#39;linknet_resnet18&#39;</span><span class="p">,</span> <span class="n">pretrained</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">assume_straight_pages</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">preserve_aspect_ratio</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
</pre></div>
</div>
<p>To modify the output structure, you can pass the following arguments to the predictor; they are handled by the underlying <cite>DocumentBuilder</cite>:</p>
<ul class="simple">
<li><p><cite>resolve_lines</cite>: whether words should be automatically grouped into lines (default: True)</p></li>
<li><p><cite>resolve_blocks</cite>: whether lines should be automatically grouped into blocks (default: True)</p></li>
<li><p><cite>paragraph_break</cite>: relative length of the minimum space separating paragraphs (default: 0.035)</p></li>
</ul>
<p>For example, to disable the automatic grouping of lines into blocks:</p>
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">doctr.models</span> <span class="kn">import</span> <span class="n">ocr_predictor</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">ocr_predictor</span><span class="p">(</span><span class="n">pretrained</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">resolve_blocks</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
</pre></div>
</div>
</section>
<section id="what-should-i-do-with-the-output">
<h3>What should I do with the output?<a class="headerlink" href="#what-should-i-do-with-the-output" title="Permalink to this heading">#</a></h3>
@@ -859,6 +870,14 @@ <h3>What should I do with the output?<a class="headerlink" href="#what-should-i-
<span class="p">)</span>
</pre></div>
</div>
<p>To get only the text content of the <cite>Document</cite>, you can use the <cite>render</cite> method:</p>
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="n">text_output</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="n">render</span><span class="p">()</span>
</pre></div>
</div>
<p>For reference, here is the output for the <cite>Document</cite> above:</p>
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="n">No</span><span class="o">.</span> <span class="n">RECEIPT</span> <span class="n">DATE</span>
</pre></div>
</div>
<p>You can also export the <cite>Document</cite> as a nested dict, which is better suited to JSON serialization:</p>
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="n">json_output</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="n">export</span><span class="p">()</span>
</pre></div>
21 changes: 21 additions & 0 deletions v0.1.0/_sources/using_doctr/using_models.rst.txt
@@ -279,6 +279,19 @@ For instance, this snippet instantiates an end-to-end ocr_predictor working with
    from doctr.models import ocr_predictor
    model = ocr_predictor('linknet_resnet18', pretrained=True, assume_straight_pages=False, preserve_aspect_ratio=True)

To modify the output structure, you can pass the following arguments to the predictor; they are handled by the underlying `DocumentBuilder`:

* `resolve_lines`: whether words should be automatically grouped into lines (default: True)
* `resolve_blocks`: whether lines should be automatically grouped into blocks (default: True)
* `paragraph_break`: relative length of the minimum space separating paragraphs (default: 0.035)

For example, to disable the automatic grouping of lines into blocks:

.. code:: python3

    from doctr.models import ocr_predictor
    model = ocr_predictor(pretrained=True, resolve_blocks=False)

What should I do with the output?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -304,6 +317,14 @@ Here is a typical `Document` layout::
)]
)

To get only the text content of the `Document`, you can use the `render` method::

    text_output = result.render()

For reference, here is the output for the `Document` above::

    No. RECEIPT DATE

You can also export the `Document` as a nested dict, which is better suited to JSON serialization::

    json_output = result.export()
2 changes: 1 addition & 1 deletion v0.1.0/searchindex.js

Large diffs are not rendered by default.

19 changes: 19 additions & 0 deletions v0.1.0/using_doctr/using_models.html
@@ -836,6 +836,17 @@ <h3>Two-stage approaches<a class="headerlink" href="#two-stage-approaches" title
<span class="n">model</span> <span class="o">=</span> <span class="n">ocr_predictor</span><span class="p">(</span><span class="s1">&#39;linknet_resnet18&#39;</span><span class="p">,</span> <span class="n">pretrained</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">assume_straight_pages</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">preserve_aspect_ratio</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
</pre></div>
</div>
<p>To modify the output structure, you can pass the following arguments to the predictor; they are handled by the underlying <cite>DocumentBuilder</cite>:</p>
<ul class="simple">
<li><p><cite>resolve_lines</cite>: whether words should be automatically grouped into lines (default: True)</p></li>
<li><p><cite>resolve_blocks</cite>: whether lines should be automatically grouped into blocks (default: True)</p></li>
<li><p><cite>paragraph_break</cite>: relative length of the minimum space separating paragraphs (default: 0.035)</p></li>
</ul>
<p>For example, to disable the automatic grouping of lines into blocks:</p>
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">doctr.models</span> <span class="kn">import</span> <span class="n">ocr_predictor</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">ocr_predictor</span><span class="p">(</span><span class="n">pretrained</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">resolve_blocks</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
</pre></div>
</div>
</section>
<section id="what-should-i-do-with-the-output">
<h3>What should I do with the output?<a class="headerlink" href="#what-should-i-do-with-the-output" title="Permalink to this heading">#</a></h3>
@@ -859,6 +870,14 @@ <h3>What should I do with the output?<a class="headerlink" href="#what-should-i-
<span class="p">)</span>
</pre></div>
</div>
<p>To get only the text content of the <cite>Document</cite>, you can use the <cite>render</cite> method:</p>
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="n">text_output</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="n">render</span><span class="p">()</span>
</pre></div>
</div>
<p>For reference, here is the output for the <cite>Document</cite> above:</p>
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="n">No</span><span class="o">.</span> <span class="n">RECEIPT</span> <span class="n">DATE</span>
</pre></div>
</div>
<p>You can also export the <cite>Document</cite> as a nested dict, which is better suited to JSON serialization:</p>
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="n">json_output</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="n">export</span><span class="p">()</span>
</pre></div>
21 changes: 21 additions & 0 deletions v0.1.1/_sources/using_doctr/using_models.rst.txt
@@ -279,6 +279,19 @@ For instance, this snippet instantiates an end-to-end ocr_predictor working with
    from doctr.models import ocr_predictor
    model = ocr_predictor('linknet_resnet18', pretrained=True, assume_straight_pages=False, preserve_aspect_ratio=True)

To modify the output structure, you can pass the following arguments to the predictor; they are handled by the underlying `DocumentBuilder`:

* `resolve_lines`: whether words should be automatically grouped into lines (default: True)
* `resolve_blocks`: whether lines should be automatically grouped into blocks (default: True)
* `paragraph_break`: relative length of the minimum space separating paragraphs (default: 0.035)

For example, to disable the automatic grouping of lines into blocks:

.. code:: python3

    from doctr.models import ocr_predictor
    model = ocr_predictor(pretrained=True, resolve_blocks=False)

What should I do with the output?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -304,6 +317,14 @@ Here is a typical `Document` layout::
)]
)

To get only the text content of the `Document`, you can use the `render` method::

    text_output = result.render()

For reference, here is the output for the `Document` above::

    No. RECEIPT DATE

You can also export the `Document` as a nested dict, which is better suited to JSON serialization::

    json_output = result.export()
2 changes: 1 addition & 1 deletion v0.1.1/searchindex.js

Large diffs are not rendered by default.

19 changes: 19 additions & 0 deletions v0.1.1/using_doctr/using_models.html
@@ -836,6 +836,17 @@ <h3>Two-stage approaches<a class="headerlink" href="#two-stage-approaches" title
<span class="n">model</span> <span class="o">=</span> <span class="n">ocr_predictor</span><span class="p">(</span><span class="s1">&#39;linknet_resnet18&#39;</span><span class="p">,</span> <span class="n">pretrained</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">assume_straight_pages</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">preserve_aspect_ratio</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
</pre></div>
</div>
<p>To modify the output structure, you can pass the following arguments to the predictor; they are handled by the underlying <cite>DocumentBuilder</cite>:</p>
<ul class="simple">
<li><p><cite>resolve_lines</cite>: whether words should be automatically grouped into lines (default: True)</p></li>
<li><p><cite>resolve_blocks</cite>: whether lines should be automatically grouped into blocks (default: True)</p></li>
<li><p><cite>paragraph_break</cite>: relative length of the minimum space separating paragraphs (default: 0.035)</p></li>
</ul>
<p>For example, to disable the automatic grouping of lines into blocks:</p>
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">doctr.models</span> <span class="kn">import</span> <span class="n">ocr_predictor</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">ocr_predictor</span><span class="p">(</span><span class="n">pretrained</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">resolve_blocks</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
</pre></div>
</div>
</section>
<section id="what-should-i-do-with-the-output">
<h3>What should I do with the output?<a class="headerlink" href="#what-should-i-do-with-the-output" title="Permalink to this heading">#</a></h3>
@@ -859,6 +870,14 @@ <h3>What should I do with the output?<a class="headerlink" href="#what-should-i-
<span class="p">)</span>
</pre></div>
</div>
<p>To get only the text content of the <cite>Document</cite>, you can use the <cite>render</cite> method:</p>
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="n">text_output</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="n">render</span><span class="p">()</span>
</pre></div>
</div>
<p>For reference, here is the output for the <cite>Document</cite> above:</p>
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="n">No</span><span class="o">.</span> <span class="n">RECEIPT</span> <span class="n">DATE</span>
</pre></div>
</div>
<p>You can also export the <cite>Document</cite> as a nested dict, which is better suited to JSON serialization:</p>
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="n">json_output</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="n">export</span><span class="p">()</span>
</pre></div>
21 changes: 21 additions & 0 deletions v0.2.0/_sources/using_doctr/using_models.rst.txt
@@ -279,6 +279,19 @@ For instance, this snippet instantiates an end-to-end ocr_predictor working with
    from doctr.models import ocr_predictor
    model = ocr_predictor('linknet_resnet18', pretrained=True, assume_straight_pages=False, preserve_aspect_ratio=True)

To modify the output structure, you can pass the following arguments to the predictor; they are handled by the underlying `DocumentBuilder`:

* `resolve_lines`: whether words should be automatically grouped into lines (default: True)
* `resolve_blocks`: whether lines should be automatically grouped into blocks (default: True)
* `paragraph_break`: relative length of the minimum space separating paragraphs (default: 0.035)

For example, to disable the automatic grouping of lines into blocks:

.. code:: python3

    from doctr.models import ocr_predictor
    model = ocr_predictor(pretrained=True, resolve_blocks=False)

What should I do with the output?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -304,6 +317,14 @@ Here is a typical `Document` layout::
)]
)

To get only the text content of the `Document`, you can use the `render` method::

    text_output = result.render()

For reference, here is the output for the `Document` above::

    No. RECEIPT DATE

You can also export the `Document` as a nested dict, which is better suited to JSON serialization::

    json_output = result.export()
2 changes: 1 addition & 1 deletion v0.2.0/searchindex.js

Large diffs are not rendered by default.

19 changes: 19 additions & 0 deletions v0.2.0/using_doctr/using_models.html
@@ -836,6 +836,17 @@ <h3>Two-stage approaches<a class="headerlink" href="#two-stage-approaches" title
<span class="n">model</span> <span class="o">=</span> <span class="n">ocr_predictor</span><span class="p">(</span><span class="s1">&#39;linknet_resnet18&#39;</span><span class="p">,</span> <span class="n">pretrained</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">assume_straight_pages</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">preserve_aspect_ratio</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
</pre></div>
</div>
<p>To modify the output structure, you can pass the following arguments to the predictor; they are handled by the underlying <cite>DocumentBuilder</cite>:</p>
<ul class="simple">
<li><p><cite>resolve_lines</cite>: whether words should be automatically grouped into lines (default: True)</p></li>
<li><p><cite>resolve_blocks</cite>: whether lines should be automatically grouped into blocks (default: True)</p></li>
<li><p><cite>paragraph_break</cite>: relative length of the minimum space separating paragraphs (default: 0.035)</p></li>
</ul>
<p>For example, to disable the automatic grouping of lines into blocks:</p>
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">doctr.models</span> <span class="kn">import</span> <span class="n">ocr_predictor</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">ocr_predictor</span><span class="p">(</span><span class="n">pretrained</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">resolve_blocks</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
</pre></div>
</div>
</section>
<section id="what-should-i-do-with-the-output">
<h3>What should I do with the output?<a class="headerlink" href="#what-should-i-do-with-the-output" title="Permalink to this heading">#</a></h3>
@@ -859,6 +870,14 @@ <h3>What should I do with the output?<a class="headerlink" href="#what-should-i-
<span class="p">)</span>
</pre></div>
</div>
<p>To get only the text content of the <cite>Document</cite>, you can use the <cite>render</cite> method:</p>
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="n">text_output</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="n">render</span><span class="p">()</span>
</pre></div>
</div>
<p>For reference, here is the output for the <cite>Document</cite> above:</p>
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="n">No</span><span class="o">.</span> <span class="n">RECEIPT</span> <span class="n">DATE</span>
</pre></div>
</div>
<p>You can also export the <cite>Document</cite> as a nested dict, which is better suited to JSON serialization:</p>
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="n">json_output</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="n">export</span><span class="p">()</span>
</pre></div>
21 changes: 21 additions & 0 deletions v0.2.1/_sources/using_doctr/using_models.rst.txt
@@ -279,6 +279,19 @@ For instance, this snippet instantiates an end-to-end ocr_predictor working with
    from doctr.models import ocr_predictor
    model = ocr_predictor('linknet_resnet18', pretrained=True, assume_straight_pages=False, preserve_aspect_ratio=True)

To modify the output structure, you can pass the following arguments to the predictor; they are handled by the underlying `DocumentBuilder`:

* `resolve_lines`: whether words should be automatically grouped into lines (default: True)
* `resolve_blocks`: whether lines should be automatically grouped into blocks (default: True)
* `paragraph_break`: relative length of the minimum space separating paragraphs (default: 0.035)

For example, to disable the automatic grouping of lines into blocks:

.. code:: python3

    from doctr.models import ocr_predictor
    model = ocr_predictor(pretrained=True, resolve_blocks=False)

What should I do with the output?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -304,6 +317,14 @@ Here is a typical `Document` layout::
)]
)

To get only the text content of the `Document`, you can use the `render` method::

    text_output = result.render()

For reference, here is the output for the `Document` above::

    No. RECEIPT DATE

You can also export the `Document` as a nested dict, which is better suited to JSON serialization::

    json_output = result.export()
2 changes: 1 addition & 1 deletion v0.2.1/searchindex.js

Large diffs are not rendered by default.
