cis520-final-project

TODO

- Clean & refactor src/app.py
- Implement known schema detection
- Try different parameters for feature generation/selection
- Implement smart epsilon detection for DBSCAN (clustering based on margins between text blocks)
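
One possible starting point for the DBSCAN epsilon item is sketched below. It is not the project's implementation: the helper name `estimate_eps`, the largest-jump heuristic, and the sample margins are all assumptions made for illustration. The idea is that margins inside a group of text blocks tend to be small and margins between groups tend to be large, so an epsilon placed in the widest gap of the sorted margins separates the two.

```python
def estimate_eps(margins):
    """Guess a DBSCAN eps from margins between text blocks.

    Hypothetical helper: places eps at the midpoint of the largest
    jump between consecutive values in the sorted margins.
    """
    ordered = sorted(margins)
    if len(ordered) < 2:
        raise ValueError("need at least two margins")
    # Size of each jump between consecutive sorted margins.
    jumps = [b - a for a, b in zip(ordered, ordered[1:])]
    # Position of the largest jump (first one wins on ties).
    i = max(range(len(jumps)), key=jumps.__getitem__)
    # Midpoint of that jump: above the small margins, below the large ones.
    return (ordered[i] + ordered[i + 1]) / 2


# Made-up vertical margins between consecutive text blocks.
margins = [4, 5, 4, 30, 5, 6, 28, 4]
eps = estimate_eps(margins)
print(eps)
```

The resulting value could then be tried as the `eps` argument of a DBSCAN implementation. If the margins have no clear gap (for example, evenly spaced blocks), this heuristic is not meaningful and a fixed fallback would be needed.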