Welcome to echonotes! This is an exciting and powerful Python application designed to automate the process of extracting handwritten notes from PDFs and summarizing them using a local AI model.

nothingmn/echonotes

EchoNotes

EchoNotes is a Python-based application that monitors a folder for new files, extracts their content (text, audio, or video), summarizes it using a locally hosted LLM, and saves the summary back to disk. Audio and video are transcribed with Whisper before summarization. It supports offline operation and handles multiple file formats, including PDFs, Word documents, text files, and video/audio files.

Features

  • Monitors a directory for new files (PDF, DOCX, TXT, MP4, MP3 formats).
  • Text Extraction:
    • PDF files (via PyPDF2 and Tesseract for OCR)
    • Word documents (via python-docx)
    • Plain text files
    • Audio files (via Whisper for speech-to-text)
    • Video files (audio extracted via FFmpeg and transcribed using Whisper)
  • Summarization:
    • Sends extracted text to a local LLM API for summarization.
    • Supports customizable markdown prompts.
  • Offline Operation:
    • All processing (text extraction, transcription, summarization) can be done offline.
    • Pre-downloads Whisper models and handles everything locally.
  • Logging: Extensive logging to help track operations and errors.
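
The directory monitoring described in the feature list could be approximated with a simple polling helper. This is a minimal sketch using only the standard library, not EchoNotes' actual implementation; the function name `find_new_files` and the case-insensitive extension matching are assumptions made for illustration.

```python
from pathlib import Path

# Extensions taken from the feature list above.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".mp4", ".mp3"}


def find_new_files(directory, seen):
    """Return supported files in `directory` that are not yet in `seen`.

    `seen` is a set that is updated in place, so repeated calls only
    report files that appeared since the previous call.
    """
    new_files = []
    for path in sorted(Path(directory).iterdir()):
        is_supported = path.suffix.lower() in SUPPORTED_EXTENSIONS
        if path.is_file() and is_supported and path not in seen:
            seen.add(path)
            new_files.append(path)
    return new_files
```

Calling `find_new_files("/app/incoming", seen)` in a loop with a short `time.sleep` between iterations would give a basic watcher; a file-system event library may be a better fit for large folders.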

Requirements

Quick Start via Docker

cp config.sample.yml config.yml

Edit config.yml and enter your Ollama endpoint, API token, and any other settings you need.

Then:

docker run -v /path/to/incoming:/app/incoming -v /path/to/config.yml:/app/config.yml -v /path/to/summarize-notes.md:/app/summarize-notes.md echonotes

For example:

 docker run --rm -v "$(pwd)/incoming:/app/incoming" \
         -v "$(pwd)/config.yml:/app/config.yml" \
         -v "$(pwd)/summarize-notes.md:/app/summarize-notes.md" \
         echonotes:latest

Installation from Source via Docker

Docker Setup

  1. Build the Docker Image:

    Clone the repository and build the Docker image:

    docker build -t echonotes .
  2. Run the Docker Container:

    Run the Docker container, mounting the appropriate volumes:

    docker run -v /path/to/incoming:/app/incoming -v /path/to/config.yml:/app/config.yml -v /path/to/summarize-notes.md:/app/summarize-notes.md echonotes
  3. Pre-Download Whisper Models (Optional):

    The Whisper models are automatically downloaded, but you can pre-download them by running:

    docker exec -it <container_id> python -c "import whisper; whisper.load_model('base')"

Docker Compose Example

You can use Docker Compose to manage the container:

version: '3.8'
services:
  echonotes:
    image: echonotes:latest
    volumes:
      - ./incoming:/app/incoming
      - ./config.yml:/app/config.yml
      - ./summarize-notes.md:/app/summarize-notes.md
    restart: unless-stopped

Run the service with:

docker-compose up -d

Usage

EchoNotes monitors the /app/incoming directory for new files. When it detects a new file, it processes it according to the file type:

  • PDF: Extracts text using PyPDF2, falling back to OCR via Tesseract if needed.
  • Word Documents (DOCX): Extracts text using python-docx.
  • Text Files (TXT): Reads the plain text.
  • Audio Files (MP3): Transcribes speech to text using Whisper.
  • Video Files (MP4): Extracts audio using FFmpeg, then transcribes it with Whisper.

Once the text is extracted, it is summarized by sending the text and a customizable markdown prompt to a local LLM API.
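
The per-type handling listed above amounts to a lookup from file extension to extraction route. The sketch below illustrates that idea only; the `EXTRACTION_ROUTES` table and `route_for` function are hypothetical names, and the real application may organize its dispatch differently.

```python
from pathlib import Path

# Maps each supported extension to the extraction route described above.
EXTRACTION_ROUTES = {
    ".pdf": "PyPDF2, then Tesseract OCR if needed",
    ".docx": "python-docx",
    ".txt": "plain text read",
    ".mp3": "Whisper",
    ".mp4": "FFmpeg then Whisper",
}


def route_for(filename):
    """Return the extraction route for a file, based on its extension."""
    suffix = Path(filename).suffix.lower()  # assumption: matching ignores case
    if suffix not in EXTRACTION_ROUTES:
        raise ValueError(f"Unsupported file type: {suffix or filename}")
    return EXTRACTION_ROUTES[suffix]
```

In a fuller version, the table values would be callables that perform the extraction rather than descriptive strings.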

Configuration

The application is configured via a config.yml file mounted into the Docker container. An example configuration file is shown below:

api_url: "http://localhost:5000/api/summarize"
bearer_token: "your_api_token_here"
model: "base"
whisper_model: "base" # Specify the Whisper model to use ('tiny', 'base', 'small', 'medium', 'large')
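
To show how these settings might be used, the following sketch builds (but does not send) an HTTP request from a configuration dictionary holding the keys above. The JSON payload shape (`model` and `prompt` fields) is an assumption for illustration; the actual schema depends on the API behind `api_url`.

```python
import json
import urllib.request


def build_summarize_request(config, text):
    """Build a POST request for the summarization API without sending it."""
    # Hypothetical payload schema; adjust to match your endpoint.
    payload = {"model": config["model"], "prompt": text}
    return urllib.request.Request(
        config["api_url"],
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {config['bearer_token']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Note that inside a container, `localhost` refers to the container itself, so the endpoint may need a host name that is reachable from within Docker.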

Markdown Prompt Customization

The contents of the prompt file (summarize-notes.md) are prepended to the extracted text as instructions for summarization. Edit the file to suit your needs.
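
As an illustration of the prepending step, the sketch below joins the prompt text and the extracted content with a blank line between them. The function name `build_prompt` and the exact separator are assumptions; EchoNotes may combine the two differently.

```python
def build_prompt(instructions, extracted_text):
    """Place the summarization instructions ahead of the extracted text."""
    # Surrounding whitespace is trimmed so the separator stays predictable.
    return f"{instructions.strip()}\n\n{extracted_text.strip()}"
```

The `instructions` argument would typically be the contents of summarize-notes.md read from disk.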

Logging

The application logs all activities and errors to help with debugging and tracking its operations. The log includes details about:

  • Files processed
  • Errors encountered
  • Summaries generated
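
A logging setup covering these events could look like the sketch below. It uses Python's standard `logging` module; the logger name, message format, and INFO level are assumptions rather than EchoNotes' actual settings.

```python
import logging


def make_logger(name="echonotes"):
    """Return a logger that writes timestamped messages to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Attach a handler only once so repeated calls do not duplicate output.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

With this in place, `make_logger().info("Processed %s", filename)` records a processed file and `make_logger().exception("Failed on %s", filename)` records an error with its traceback.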

Folder Structure

  • incoming: Monitored folder where new files are placed for processing.
  • working: Temporary folder where files are processed.
  • completed: Once processed, files (and summaries) are moved to the completed folder.
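
The movement of a file through these folders can be sketched as follows. This is an illustrative outline, not the project's code: `process_file`, the `summarize` callback, and the `.summary.md` naming are all hypothetical.

```python
import shutil
from pathlib import Path


def process_file(source, base_dir, summarize):
    """Move a file from incoming to working, summarize it, then archive it.

    `summarize` is any callable that takes a path and returns summary text.
    Returns the archived file path and the summary file path.
    """
    base = Path(base_dir)
    working = base / "working"
    completed = base / "completed"
    working.mkdir(parents=True, exist_ok=True)
    completed.mkdir(parents=True, exist_ok=True)

    source = Path(source)
    in_progress = Path(shutil.move(str(source), str(working / source.name)))
    summary = summarize(in_progress)

    done = Path(shutil.move(str(in_progress), str(completed / in_progress.name)))
    summary_path = done.with_name(done.stem + ".summary.md")  # hypothetical naming
    summary_path.write_text(summary, encoding="utf-8")
    return done, summary_path
```

Moving the file into working before processing keeps the monitored folder from reporting it again while it is still being handled.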

Contributing

We welcome contributions to EchoNotes! Please fork the repository and submit a pull request with your changes.

License

EchoNotes is licensed under the MIT License.
