TLDR 531 pdf_txtlayer_reader table fix #380
*corresponding tests are in test_module_table_detection
...mage_reader/table_recognizer/table_extractors/concrete_extractors/onepage_table_extractor.py
Let's also add a unit test with a simple little file containing a table (you can take one from the issue).
* removed path_cells entirely * fixed path creation
* one small bug fixed * if debug_mode is not set in the test config, the test doesn't pass
* path_detect forward fix * some debug_mode and path_debug bugs fixed
dedoc/readers/pdf_reader/pdf_image_reader/table_recognizer/table_utils/img_processing.py
dedoc/readers/pdf_reader/pdf_image_reader/table_recognizer/table_recognizer.py
@@ -52,7 +52,7 @@ def __init__(self, *, config: dict) -> None:
         self.binarizer = AdaptiveBinarizer()
         self.ocr = OCRLineExtractor(config=config)
         self.logger = config.get("logger", logging.getLogger())
-        if self.config.get("debug_mode") and not os.path.exists(self.config["path_debug"]):
+        if self.config.get("debug_mode", False) and not os.path.exists(self.config["path_debug"]):
Regarding debug_mode: pick whichever you prefer, config.get("debug_mode") or config.get("debug_mode", False) (or whichever is easier to change), and let's use the same form everywhere. Although during development everyone will still write it their own way.
Regarding path_debug: it's better to simply use .get("path_debug").
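To illustrate the point of this comment, here is a minimal sketch, not the code from this PR. The helper name `ensure_debug_dir` is hypothetical; only the `debug_mode` and `path_debug` keys come from the diff above. It shows that the two `debug_mode` lookups behave the same in a boolean context, and that reading `path_debug` with `.get` avoids a `KeyError` when the key is missing.

```python
import os
import tempfile


def ensure_debug_dir(config: dict) -> bool:
    """Hypothetical helper: create the debug directory when debug mode is on.

    Returns True if debug output is enabled and the directory exists afterwards.
    """
    # .get("debug_mode") returns None when the key is absent and
    # .get("debug_mode", False) returns False; both are falsy in an `if`.
    debug_mode = config.get("debug_mode", False)
    # .get avoids the KeyError that config["path_debug"] would raise
    # when debug_mode is set but path_debug is missing.
    path_debug = config.get("path_debug")
    if not debug_mode or path_debug is None:
        return False
    os.makedirs(path_debug, exist_ok=True)
    return True


if __name__ == "__main__":
    # Both lookup styles agree on an empty config.
    assert bool({}.get("debug_mode")) == bool({}.get("debug_mode", False))
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "debug")
        print(ensure_debug_dir({"debug_mode": True, "path_debug": target}))
```

Whichever form of the `debug_mode` lookup is chosen, the reviewer's request is only that it be used consistently across the modules.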
* path_detect forward style changed * img_processing.py:146 - get_config()["debug_mode"] fixed * ocr_cell_extractor.py - changed if False to if NoneType is None
* TLDR 531 pdf_txtlayer_reader table fix (#380)
* TLDR-538 tesseract trustai (#377)
* fixed training script (#383)
* TLDR-521 Fix splittext for file names with several dots (#385)
* TLDR-527 refactor methods and parameters for all main classes (#387)
* Add attach and table annotations to PPTX (#389)
* TLDR-544 docx bugs (#382)
* TLDR-516 GPU in docker (#384)
* new version 2.0 (#390)
---------
Co-authored-by: raxtemur <[email protected]>
Co-authored-by: Oksana Belyaeva <[email protected]>
Co-authored-by: Alexander Golodkov <[email protected]>
Co-authored-by: Alexander Golodkov <[email protected]>
Co-authored-by: Nikita Shevtsov <[email protected]>