When to provide ocr? #34

Open
ApxToTop opened this issue Mar 23, 2021 · 2 comments
Comments

@ApxToTop

When will the OCR function be added?

@sciguy16
Member

I don't have a definite timeline for implementing OCR - at the moment finishing the web rendering rewrite is higher priority.

Noting down some initial thoughts for when I get some time for this:

  • There are rust bindings for tesseract-ocr
  • If we opt for tesseract then it'll ideally need to be statically linked on all supported platforms, and we'll need to distribute a set of training data alongside scrying
  • Will tesseract give sensible output if it's given the full RDP captures or would it benefit from having the images somehow cropped down to just the usernames?
  • Would it make sense to implement this at the same time as heuristics for determining the server version (e.g. because different Windows versions have slightly different username displays, and it might be more performant to search the image for multiple features at the same time rather than running them through separate pipelines)?
  • Does username parsing need the full feature set of tesseract or would a simpler implementation make sense, based on commonly used fonts?
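One way to explore the cropping question is a cheap heuristic: on a Windows login screen the username text tends to sit in a roughly predictable band of the capture, so a fixed fractional crop could be tried before falling back to full-image OCR. A minimal sketch in Rust — the band fractions below are guesses for illustration, not measured values:

```rust
/// A rectangle within a captured RDP screenshot, in pixels.
#[derive(Debug, PartialEq)]
struct CropRegion {
    x: u32,
    y: u32,
    width: u32,
    height: u32,
}

/// Guess where the username text might sit in a Windows login capture.
/// The fractions here are placeholder assumptions: a horizontally
/// centered band just below the vertical middle of the screen.
fn username_crop(img_width: u32, img_height: u32) -> CropRegion {
    let x = img_width / 4;         // skip the left quarter
    let width = img_width / 2;     // keep the middle half
    let y = img_height * 55 / 100; // start just below vertical center
    let height = img_height / 10;  // a thin band for the name line
    CropRegion { x, y, width, height }
}

fn main() {
    // For a 1920x1080 capture this yields a 960x108 band.
    let region = username_crop(1920, 1080);
    println!("{:?}", region);
}
```

If the crop misses (empty or garbage OCR output), the full image could be retried, which would also give a cheap signal for the version-heuristics idea above.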

@ApxToTop
Author

  1. Agreed on using tesseract-ocr.
  2. Use tesseract to produce reasonable output and write it to a text file.
    Format reference:
    1.1.1.1:3389 user: admin|test os: windows 10
  3. Fonts should include English, Korean, Russian, Japanese, and Chinese.
  4. You should collect some RDP images for training; I can provide a large number of RDP images so that you can refine it.
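Producing that output line is straightforward string formatting once the OCR results exist; a small sketch of what it might look like (the `OcrResult` type and `format_result` function are made up here for illustration, not part of scrying):

```rust
/// One OCR result for a scanned RDP target.
/// Field names are hypothetical; scrying does not define this type.
struct OcrResult {
    /// Target ip:port.
    addr: String,
    /// Usernames read from the capture.
    usernames: Vec<String>,
    /// Detected OS string.
    os: String,
}

/// Render one result in the suggested format:
/// `1.1.1.1:3389 user: admin|test os: windows 10`
fn format_result(r: &OcrResult) -> String {
    format!("{} user: {} os: {}", r.addr, r.usernames.join("|"), r.os)
}

fn main() {
    let r = OcrResult {
        addr: "1.1.1.1:3389".to_string(),
        usernames: vec!["admin".to_string(), "test".to_string()],
        os: "windows 10".to_string(),
    };
    // Prints: 1.1.1.1:3389 user: admin|test os: windows 10
    println!("{}", format_result(&r));
}
```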

It is also suggested that a command-line switch be added to control the HTML report: when a large number of detections are performed, the HTML file becomes extremely large and hurts program performance.
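Such a switch could be a plain boolean flag; a minimal hand-rolled sketch (the `--no-html` flag name is an assumption, not an existing scrying option):

```rust
/// Report options controlled from the command line.
struct ReportOpts {
    html_report: bool,
}

/// Parse the report switch out of the raw argument list.
/// The `--no-html` flag name is hypothetical.
fn parse_report_opts(args: &[String]) -> ReportOpts {
    ReportOpts {
        html_report: !args.iter().any(|a| a == "--no-html"),
    }
}

fn main() {
    let args: Vec<String> = std::env::args().collect();
    let opts = parse_report_opts(&args);
    if opts.html_report {
        println!("HTML report enabled");
    } else {
        println!("HTML report disabled for large scans");
    }
}
```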

Thank you for your selfless dedication.
