gramener/entityextraction

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Entity Extraction

An interactive web application that extracts structured entities (like names, addresses, dates, etc.) from unstructured documents.

Entity Extraction Screenshot

Features

  • Document Support: Load and analyze various document types, e.g. Property Appraisal Reports, Credit Reports, Title Deeds, Home Loan Agreements
  • Flexible Entity Types: Extract multiple types of entities. For example: Names, Addresses, Dates, Personal identifiers, Document identifiers
  • Interactive UI:
    • Real-time entity highlighting
    • Dark/Light theme support
    • Responsive design
    • Sticky navigation for easy document review
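The highlighting feature can be pictured as a span-matching step: given the extracted entity values, locate them in the source text and wrap each occurrence. The following is only a sketch under that assumption; `find_spans` and `highlight` are hypothetical names, and the repository's actual highlighting logic may work differently.

```python
import html
import re


def find_spans(text, values):
    """Return sorted, non-overlapping (start, end) spans where any value occurs."""
    spans = []
    for value in values:
        if not value:
            continue  # skip empty strings, which would match everywhere
        for m in re.finditer(re.escape(value), text, flags=re.IGNORECASE):
            spans.append((m.start(), m.end()))
    # Earliest start first; for equal starts, prefer the longer match.
    spans.sort(key=lambda s: (s[0], -(s[1] - s[0])))
    result = []
    last_end = 0
    for start, end in spans:
        if start >= last_end:  # drop spans that overlap an accepted one
            result.append((start, end))
            last_end = end
    return result


def highlight(text, values):
    """Wrap each matched span in <mark> tags and HTML-escape all text."""
    parts = []
    cursor = 0
    for start, end in find_spans(text, values):
        parts.append(html.escape(text[cursor:start]))
        parts.append("<mark>" + html.escape(text[start:end]) + "</mark>")
        cursor = end
    parts.append(html.escape(text[cursor:]))
    return "".join(parts)
```

Escaping the text before inserting it into markup keeps characters such as `<` in a pasted document from being interpreted as HTML.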

Getting Started

  1. Clone this repository
  2. Serve the files using any static web server
  3. Open index.html in a modern browser
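Any static server works for step 2. As one possible option, the sketch below uses only Python's standard library; the helper name `make_static_server`, the port 8000, and the loopback address are illustrative assumptions, not project requirements. Serving over HTTP matters because browsers generally refuse to load ES modules from `file://` URLs.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_static_server(directory=".", port=8000):
    """Build (but do not start) an HTTP server that serves `directory`."""
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    # Bind to the loopback interface only; the port is an arbitrary choice.
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# To serve the cloned repository from its root folder, run:
#     make_static_server(".", 8000).serve_forever()
# and then open http://127.0.0.1:8000/index.html in the browser.
```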

Usage

  1. Select a document type from the dropdown or paste your text
  2. Specify the entities you want to extract (one per line)
  3. Click "Analyze" to process the document
  4. Review the extracted entities in the results panel
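Step 2 takes the entity types as free text, one per line. A minimal sketch of how such input could be normalized before analysis is shown below; `parse_entity_list` is a hypothetical helper, and the de-duplication rule (case-insensitive, first spelling wins) is an assumption rather than documented behavior.

```python
def parse_entity_list(raw):
    """Split a multi-line string into unique, non-empty entity names."""
    seen = set()
    entities = []
    for line in raw.splitlines():
        name = line.strip()
        key = name.casefold()  # compare names case-insensitively
        if name and key not in seen:
            seen.add(key)
            entities.append(name)  # keep the first spelling encountered
    return entities
```

For example, the input lines `Names`, a blank line, ` Addresses `, `names`, and `Dates` would yield `["Names", "Addresses", "Dates"]`.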

Technology Stack

  • Frontend: Vanilla JavaScript with ES modules
  • Styling: Bootstrap 5.3 with Bootstrap Icons
  • Data Processing: Python (via Pyodide Web Worker)
  • Visualization: D3.js
  • Markdown Processing: Marked
  • Syntax Highlighting: highlight.js

Authentication

  • Requires login through LLM Foundry
  • Authentication handled via llmfoundry.straive.com

Development

Built with modern web standards:

  • ES Modules for JavaScript
  • Async/await for asynchronous operations
  • Web Workers for computation
  • Responsive Bootstrap components

Credits

Designed by Gramener

License

This project is licensed under the MIT License. See the LICENSE file for details.
