The basic object crass operates on is a page with separator lines.
The page is first cropped into segments along the separator lines,
and the segments are then spliced together into a new image.
The page must contain at least one vertical and one horizontal line for the code
to run correctly. In an additional preprocessing step, crass can detect
the rotation of the page and rotate it to the correct angle.
This process is called "deskewing".
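
To illustrate the idea behind deskewing, the tilt of a page can be estimated from the two end points of a separator line that should be horizontal. The following is only a minimal sketch under that assumption and not the code crass uses; the function name `deskew_angle` and its parameters are hypothetical.

```python
import math


def deskew_angle(x0, y0, x1, y1):
    """Return the tilt, in degrees, of a line that should be horizontal.

    (x0, y0) and (x1, y1) are the end points of the detected line in
    pixel coordinates. A perfectly horizontal line yields 0.0.
    """
    return math.degrees(math.atan2(y1 - y0, x1 - x0))


# A line that rises 10 pixels over a width of 1000 pixels is tilted
# by a little more than half a degree.
angle = deskew_angle(0, 0, 1000, 10)
```

The page would then be rotated by the opposite of this angle. Whether a positive value means clockwise or counter-clockwise depends on the coordinate and rotation conventions of the image library in use.
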
crass can process either a single page or a folder containing several
pages with the same extension. The output files have the same extension
as the input files.
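
The "single page or folder" behaviour can be pictured as follows. This is a hedged sketch using only the Python standard library, not the input handling of crass itself; the helper name `collect_pages` and its `extension` parameter are assumptions made for illustration.

```python
from pathlib import Path


def collect_pages(path, extension="jpg"):
    """Return the page files to process as a sorted list of paths.

    If `path` is a file, it is returned on its own. If it is a folder,
    every file in it with the given extension is returned; the
    comparison ignores upper and lower case.
    """
    p = Path(path)
    if p.is_file():
        return [p]
    wanted = "." + extension.lower()
    return sorted(
        f for f in p.iterdir() if f.is_file() and f.suffix.lower() == wanted
    )
```
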
By default, crass places the individual segments and the debug information and images
in the directory "out/..". For example,
the subdirectory "out/spliced/.." contains the final spliced images.
The image file format accepted by crass is jpg.
- Find the top or bottom horizontal line.
- Compute the deskew angle.
- Rotate to the correct angle.
- Find the top or bottom horizontal line.
- Find all vertical lines in the middle (by default) of the page.
- Compute the clipping masks.
- There are 5 types of segments:
  - h = header
  - a = left side separated by a vertical line
  - b = right side separated by a vertical line
  - c = space between the header and a vertical line, or between one vertical line and the next
  - f = footer
- Crop the individual segments (by default, the header and footer information is also stored).
- Splice the individual segments in a defined order (by default: a, then b, then a, and so on, until a c segment forms the end).
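
The splicing step above can be sketched as stacking the cropped segments vertically in reading order. The code below is an illustration only and rests on several assumptions that are not taken from crass: the `Segment` class and `splice_layout` function are hypothetical, the header and footer are left out of the spliced result, and segments are ordered by their top edge with a placed before b.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    kind: str    # one of "h", "a", "b", "c", "f"
    top: int     # y coordinate of the segment's upper edge on the page
    width: int   # segment width in pixels
    height: int  # segment height in pixels


def splice_layout(segments):
    """Compute where each body segment lands in the spliced image.

    Returns a list of (kind, y_offset) pairs in splice order and the
    (width, height) of the canvas needed to hold all of them.
    """
    # Assumption: only a, b and c segments are spliced.
    body = [s for s in segments if s.kind in {"a", "b", "c"}]
    # Top-to-bottom order; for equal tops, "a" sorts before "b".
    body.sort(key=lambda s: (s.top, s.kind))

    offsets = []
    y = 0
    for s in body:
        offsets.append((s.kind, y))
        y += s.height  # the next segment starts below this one
    canvas_width = max((s.width for s in body), default=0)
    return offsets, (canvas_width, y)


page = [
    Segment("h", 0, 1000, 100),
    Segment("b", 100, 500, 400),
    Segment("a", 100, 500, 400),
    Segment("c", 500, 1000, 50),
    Segment("f", 550, 1000, 80),
]
offsets, size = splice_layout(page)
# offsets is [("a", 0), ("b", 400), ("c", 800)] and size is (1000, 850)
```

With an image library, each cropped segment would then be pasted onto a blank canvas of that size at its computed offset.
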
You can find more detailed information about the individual steps and the available settings in the image processing documentation.