Census Image Classification #2

Open
benwbrum opened this issue Feb 28, 2018 · 2 comments

Comments

@benwbrum
Member

One challenge in presenting images for users to transcribe is knowing which images contain meaningful text (census entries) and which are irrelevant to our purposes (microfilm artifacts, district descriptions, total sheets, signatures). We would like software that can separate images containing entries from all other images, whether through computer vision, fuzzy matching on OCR output, or other methods.
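To make the "fuzzy matching on OCR" idea concrete, here is a minimal sketch using only Python's standard `difflib`. It assumes the OCR text is already available as a string. The keyword list, the `fuzzy_hits` and `looks_like_entry_page` names, and both thresholds are hypothetical, not something specified in this issue; a real vocabulary would come from the census forms themselves.

```python
from difflib import SequenceMatcher

# Hypothetical column-header words; replace with terms taken from the real forms.
HEADER_KEYWORDS = ("name", "age", "occupation", "birthplace")


def fuzzy_hits(ocr_text, keywords=HEADER_KEYWORDS, threshold=0.8):
    """Return the set of keywords that approximately match some OCR token."""
    tokens = [token.lower() for token in ocr_text.split()]
    hits = set()
    for keyword in keywords:
        for token in tokens:
            # ratio() is 1.0 for identical strings and drops with each OCR error.
            if SequenceMatcher(None, keyword, token).ratio() >= threshold:
                hits.add(keyword)
                break
    return hits


def looks_like_entry_page(ocr_text, min_hits=2):
    """Guess that a page holds census entries if enough header words appear."""
    return len(fuzzy_hits(ocr_text)) >= min_hits
```

With these settings a token such as `0ccupation` still counts as a hit for `occupation`, while a page whose OCR output contains none of the header words is rejected. Both thresholds would need tuning against the sample data.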

The sample data presently includes

@cramraj8

#6
When we resize the original images (3000+, 2000+) down to (430, 250), we distort them. For a CNN to classify well, there must be distinct features that differentiate the classes, but in our problem the images are full of text and artifacts. So without interpreting the trained model, how can we be sure it is extracting the right distinguishing features? For instance, Grad-CAM can be used to localize the features behind a prediction.
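A quick calculation shows how large the distortion is. This sketch assumes both tuples are (width, height) and uses 3000 x 2000 as a stand-in for the "3000+, 2000+" originals. The 40 px glyph height is a made-up figure for illustration. The aspect-preserving fit at the end (often called letterboxing) is shown only for comparison and is not something proposed in this thread.

```python
def resize_scales(src_size, dst_size):
    """Return (scale_x, scale_y, stretch) for a plain resize.

    Sizes are (width, height). A stretch above 1 means content ends up
    wider relative to its height than it was in the source.
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    scale_x = dst_w / src_w
    scale_y = dst_h / src_h
    return scale_x, scale_y, scale_x / scale_y


def letterbox_size(src_size, dst_size):
    """Largest size that fits inside dst_size and keeps the source aspect ratio."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    scale = min(dst_w / src_w, dst_h / src_h)
    return round(src_w * scale), round(src_h * scale)


SOURCE = (3000, 2000)  # stand-in for the "3000+, 2000+" originals
TARGET = (430, 250)    # assumed to be (width, height)

scale_x, scale_y, stretch = resize_scales(SOURCE, TARGET)
glyph_height_after = 40 * scale_y        # hypothetical 40 px tall character
fitted = letterbox_size(SOURCE, TARGET)  # size without aspect distortion
```

Under these assumptions the image shrinks by roughly 7x horizontally and 8x vertically, so content is stretched by about 15% and a 40 px character is left with about 5 px of height, which is likely too little for stroke-level features to survive.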

@cramraj8

We have two options:

  1. Swap the order of classification and detection in the pipeline.
  2. Upsample the images.

Option (1) would undercut the cost benefit we want from classification, so upsampling might help us.
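For reference, option 2 could look like the sketch below: a dependency-free bilinear upsampler for a grayscale image stored as a list of rows. The `upsample_bilinear` name and the integer `factor` parameter are assumptions made for illustration; in practice an image library's resize routine would do the same job. Interpolation adds pixels rather than detail, so the gain probably depends on upsampling from a larger intermediate size rather than from the (430, 250) version.

```python
def upsample_bilinear(image, factor):
    """Upsample a 2-D grayscale image (list of rows) by an integer factor."""
    src_h, src_w = len(image), len(image[0])
    dst_h, dst_w = src_h * factor, src_w * factor
    out = []
    for y in range(dst_h):
        # Map the output pixel centre back into source coordinates, clamped
        # so that border pixels repeat the nearest source value.
        sy = min(max((y + 0.5) / factor - 0.5, 0.0), src_h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, src_h - 1)
        wy = sy - y0
        row = []
        for x in range(dst_w):
            sx = min(max((x + 0.5) / factor - 0.5, 0.0), src_w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, src_w - 1)
            wx = sx - x0
            # Blend horizontally on the two neighbouring rows, then vertically.
            top = image[y0][x0] * (1 - wx) + image[y0][x1] * wx
            bottom = image[y1][x0] * (1 - wx) + image[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out
```

Calling `upsample_bilinear([[0, 100], [0, 100]], 2)` returns a 4 x 4 grid in which each row ramps smoothly from 0 to 100 instead of jumping between the two source values.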
