Screenshot Redaction

Avoid leaking sensitive information when sharing screenshots. Security is hard, let your device try to help you. Using Tesseract and automatic processing of screenshots, simply review, edit, and share redacted versions of your screenshots.

Background

In several Slack groups (chat rooms) I have witnessed people mistakenly share sensitive information like email addresses, home addresses, and phone numbers. As sharing information becomes easier it is becoming increasingly difficult to self monitor. Screenshot Redaction aims to keep users cognizant of what they are potentially sharing automatically and with limited friction.

Building

This project is built with Gradle using the Gradle Wrapper script.

Linux
./gradlew assembleDebug

Windows
gradlew.bat assembleDebug

How Redaction Works

The redaction process is currently mostly static and fairly simple. In the future the process will be more flexible allowing submission of photos for processing or even regions of photos. The process initially uses Tesseract OCR to find words inside the image. Once this process is finished, users are notified of completion. If a user chooses to view the redactions, the currently enabled word dictionaries are applied to the results. Dictionaries can choose to white list or black list with their own internal rules. The end result is a screenshot with zero or more words wrapped in boxes and blacked out.

Trained Languages

Currently, English is the only trained language that is implemented in the application. The application settings holds a placeholder for future languages though no backend functionality exists yet for downloading and using alternate language. To support this feature, dictionaries will first need to be expanded for other languages and the languages training data will likely need to be packaged with the dictionaries to prevent bloating the APK further. If you are interested in trying new languages or updating the English langauge, you can follow the Tesseract guides for training or use the list of prebuilt training data.

Dictionaries

The dictionaries are plain text flat files with one word on each line. Dictionaries should not have any blank lines and should be saved to the app/resources directory using a name beginning with wordlist_ and underscores instead of spaces.

When the project is built, the dictionary will be converted to a MapDB store and save to the project raw resources. Changes to dictionaries are also monitored and will result in the dictionary map being rebuilt.

Dictionaries by themselve have no function and require a concrete implementation. Once implemented the dictionary currently needs to be added to the dictionary provider implementation. This is a termporary limitation that will use simple rules and a mapping dictionary in the future.

Licenses

Licenses are stored in the raw resources and require manual addition to the licensing_list JSON Object.

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
app		app
art		art
gradle/wrapper		gradle/wrapper
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
build.gradle		build.gradle
gradle.properties		gradle.properties
gradlew		gradlew
gradlew.bat		gradlew.bat
settings.gradle		settings.gradle

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Screenshot Redaction

Background

Building

How Redaction Works

Trained Languages

Dictionaries

Licenses

About

Releases

Packages

Languages

License

cryptoroom/Screenshot-Redaction

Folders and files

Latest commit

History

Repository files navigation

Screenshot Redaction

Background

Building

How Redaction Works

Trained Languages

Dictionaries

Licenses

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages