Summarize Anything

A Python tool that downloads audio from YouTube videos, transcribes it with the DeepInfra API, summarizes the content, and generates both HTML and PDF summaries. It can also summarize existing SRT/VTT subtitle files.

Features

Download YouTube Audio: Extracts audio from YouTube videos using yt-dlp.
Transcription: Transcribes audio using the DeepInfra Whisper API with support for chunked processing.
Summarization: Generates detailed summaries of the transcribed content in HTML format.
PDF Generation: Converts HTML summaries to PDF using WeasyPrint.
SRT Processing: Supports processing existing SRT/VTT subtitle files.
Caching: Caches transcriptions to avoid redundant processing.
Logging: Detailed logging for monitoring and debugging.
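
The caching feature can be pictured with a small sketch. This is not the code in main.py: the function name, the cache directory, and the file naming are assumptions made for illustration. The idea is simply to look for a saved transcript first and call the (expensive) transcription step only when none exists.

```python
from pathlib import Path
from typing import Callable


def cached_transcription(video_id: str, cache_dir: Path,
                         transcribe: Callable[[str], str]) -> str:
    """Return the transcript for video_id, calling transcribe() only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme; the real tool may store its cache differently.
    cache_file = cache_dir / f"{video_id}_transcription.txt"
    if cache_file.exists():
        return cache_file.read_text(encoding="utf-8")
    text = transcribe(video_id)
    cache_file.write_text(text, encoding="utf-8")
    return text
```

Passing the transcription step in as a callable keeps the sketch independent of any particular API client.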

Prerequisites

1. macOS Setup

This tool relies on several dependencies, including WeasyPrint for PDF generation. Follow the steps below to set up your macOS environment.

a. Install Homebrew

Homebrew is a package manager for macOS. If you don’t have Homebrew installed, open your terminal and run:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Verify the installation:

brew --version

b. Install Required Libraries

WeasyPrint requires several libraries such as Cairo, Pango, GDK-Pixbuf, and GTK+3.

For Intel Macs:

brew install libffi glib gobject-introspection cairo pango gdk-pixbuf gtk+3

For Apple Silicon (M1/M2 Macs):

arch -arm64 brew install libffi glib gobject-introspection cairo pango gdk-pixbuf gtk+3

c. Export Library Paths

Ensure the system can locate the installed libraries by updating your environment variables.

For Intel Macs: Add the following lines to your shell configuration file (e.g., ~/.zshrc or ~/.bash_profile):

export PATH="/usr/local/bin:/usr/local/sbin:$PATH"
export DYLD_LIBRARY_PATH="/usr/local/lib:$DYLD_LIBRARY_PATH"
export PKG_CONFIG_PATH="/usr/local/lib/pkgconfig:$PKG_CONFIG_PATH"

For Apple Silicon (M1/M2 Macs): Add these lines to your shell configuration file:

export PATH="/opt/homebrew/bin:/opt/homebrew/sbin:$PATH"
export DYLD_LIBRARY_PATH="/opt/homebrew/lib:$DYLD_LIBRARY_PATH"
export PKG_CONFIG_PATH="/opt/homebrew/lib/pkgconfig:$PKG_CONFIG_PATH"

Reload the configuration:

source ~/.zshrc  # or source ~/.bash_profile

d. Verify Installation

Check if the required libraries are available:

pkg-config --cflags --libs gobject-2.0 cairo pango

If no errors are reported, the libraries are correctly installed.
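
The same check can be approximated from Python, which is closer to how WeasyPrint itself loads the libraries at runtime. The sketch below uses the standard library's `ctypes.util.find_library`. The library names are assumptions derived from the pkg-config command above, and `find_library` may not search every Homebrew location, so treat a "missing" result as a hint rather than proof.

```python
from ctypes.util import find_library

# Assumed shared-library names for the packages checked with pkg-config above.
REQUIRED_LIBS = ("gobject-2.0", "cairo", "pango-1.0")


def missing_libraries(names=REQUIRED_LIBS, finder=find_library):
    """Return the names for which the finder cannot locate a shared library."""
    return [name for name in names if finder(name) is None]


if __name__ == "__main__":
    missing = missing_libraries()
    print("All libraries found" if not missing else f"Missing: {', '.join(missing)}")
```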

2. Python Environment

Ensure you have a recent version of Python 3 installed (3.10 or higher is a safe choice). You can check your Python version with:

python3 --version

3. API Keys

This tool requires API keys for DeepInfra and OpenRouter. Obtain your API keys from their respective platforms and store them securely.

OpenRouter: https://openrouter.ai/settings/keys

DeepInfra: https://deepinfra.com/dash/api_keys

Installation

Clone the Repository

Clone the rodion-m/summarize_anything repository and change into the project directory.

Create a Virtual Environment

It's recommended to use a virtual environment to manage dependencies.
```
python3 -m venv venv
source venv/bin/activate
```

Install Python Dependencies

Install the dependencies:

pip install -r requirements.txt

Alternatively, you can install dependencies directly:

pip install backoff litellm requests webvtt-py yt-dlp python-dotenv pydantic WeasyPrint

Configuration

Environment Variables

Create a .env file in the project root directory and add your API keys:
```
DEEPINFRA_API_KEY=your_deepinfra_api_key
OPENROUTER_API_KEY=your_openrouter_api_key
```
Replace your_deepinfra_api_key and your_openrouter_api_key with your actual API keys.

DEEPINFRA_API_KEY is required only for audio transcription.
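
A quick way to catch a missing key before a long run is to validate the environment up front. The following is a minimal sketch, not the tool's own startup code: the function name is hypothetical, and it assumes OPENROUTER_API_KEY is always needed while DEEPINFRA_API_KEY is needed only when audio is transcribed, as stated above.

```python
from typing import Mapping


def missing_api_keys(env: Mapping[str, str], needs_transcription: bool = True) -> list[str]:
    """Return the names of required API keys that are absent or empty in env."""
    required = ["OPENROUTER_API_KEY"]
    if needs_transcription:
        # Per the note above, DeepInfra is only used for audio transcription.
        required.append("DEEPINFRA_API_KEY")
    return [key for key in required if not env.get(key, "").strip()]
```

In practice this could be called as `missing_api_keys(os.environ)` after `load_dotenv()` from python-dotenv has read the .env file.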

Verify WeasyPrint Installation

Ensure that WeasyPrint can generate PDFs by running a test script or using the provided test_weasyprint.py as described in the WeasyPrint Setup Guide.
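
If test_weasyprint.py is not at hand, a test script along these lines should do. It is a sketch rather than the repository's own script: it renders a one-line HTML document with WeasyPrint's `HTML(string=...).write_pdf(...)` and reports whether a non-empty PDF was produced. The function name and output path are illustrative.

```python
import tempfile
from pathlib import Path


def weasyprint_works(pdf_path: Path) -> bool:
    """Try to render a tiny HTML document to PDF; return True on success."""
    try:
        # Imported here because WeasyPrint typically raises OSError at import
        # time when its native libraries (Pango and friends) cannot be loaded.
        from weasyprint import HTML
        HTML(string="<h1>WeasyPrint test</h1>").write_pdf(str(pdf_path))
    except (ImportError, OSError) as exc:
        print(f"WeasyPrint is not usable: {exc}")
        return False
    return pdf_path.exists() and pdf_path.stat().st_size > 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print("OK" if weasyprint_works(Path(tmp) / "test.pdf") else "FAILED")
```

A failure here usually points back to the library installation and path steps in the Prerequisites section.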

Usage

The tool can be used to process YouTube videos or existing SRT/VTT files. Below are the instructions for both use cases.

Transcribe and Summarize a YouTube Video

Basic Command
```
python main.py --youtube "https://www.youtube.com/watch?v=your_video_id" --target-language "English"
```
- --youtube: URL of the YouTube video to process.
- --target-language: Target language for the summary (e.g., "English", "Spanish").

Optional Arguments
- --use-subtitles: Use subtitles from YouTube if available (default: True).
- --output-dir: Directory to save output files (default: output).

Example

python main.py --youtube "https://www.youtube.com/watch?v=dQw4w9WgXcQ" --target-language "Russian" --output-dir "results"

Process an Existing SRT File

Command
```
python main.py --srt "/path/to/subtitles.srt" --target-language "French"
```
- --srt: Path to an existing SRT or VTT file.
- --target-language: Target language for the summary.

Example

python main.py --srt "./subtitles/video_subtitles.srt" --target-language "German" --output-dir "summaries"
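
For readers who want to see how the options above fit together, here is a hedged argparse sketch of the command-line interface as documented. It is not copied from main.py. In particular, treating --youtube and --srt as mutually exclusive, requiring --target-language, and exposing --use-subtitles as an on/off flag pair are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (a sketch, not main.py)."""
    parser = argparse.ArgumentParser(
        description="Summarize a YouTube video or an existing subtitle file.")
    # Assumption: exactly one input source is given per run.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--youtube", help="URL of the YouTube video to process")
    source.add_argument("--srt", help="Path to an existing SRT or VTT file")
    parser.add_argument("--target-language", required=True,
                        help='Target language for the summary, e.g. "English"')
    # BooleanOptionalAction (Python 3.9+) also adds --no-use-subtitles.
    parser.add_argument("--use-subtitles", action=argparse.BooleanOptionalAction,
                        default=True, help="Use YouTube subtitles if available")
    parser.add_argument("--output-dir", default="output",
                        help="Directory to save output files")
    return parser
```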

Output

The tool generates several output files in the specified --output-dir (default is output):

Transcription Files:
- video_id_transcription.txt: Full transcription text.
- video_id_transcription.srt: SRT file with timed subtitles.

Summary Files:
- video_id_summary.html: HTML file containing the summarized content.
- video_id_summary.pdf: PDF version of the HTML summary.

Example:

output/
├── dQw4w9WgXcQ_transcription.txt
├── dQw4w9WgXcQ_transcription.srt
├── dQw4w9WgXcQ_summary.html
└── dQw4w9WgXcQ_summary.pdf
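
The naming scheme above can be reproduced in a few lines, which is handy when a script needs to locate the results afterwards. This sketch assumes the video ID is the `v` query parameter of a standard watch URL; both helper names are hypothetical, and how the tool names outputs for SRT input is not covered here.

```python
from pathlib import Path
from urllib.parse import parse_qs, urlparse


def video_id_from_url(url: str) -> str:
    """Extract the `v` query parameter from a standard YouTube watch URL."""
    values = parse_qs(urlparse(url).query).get("v")
    if not values:
        raise ValueError(f"No video id found in {url!r}")
    return values[0]


def output_paths(video_id: str, output_dir: str = "output") -> dict[str, Path]:
    """Return the four output files the README lists for a given video id."""
    base = Path(output_dir)
    return {
        "txt": base / f"{video_id}_transcription.txt",
        "srt": base / f"{video_id}_transcription.srt",
        "html": base / f"{video_id}_summary.html",
        "pdf": base / f"{video_id}_summary.pdf",
    }
```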

Troubleshooting

Common Issues

WeasyPrint PDF Generation Errors
- Error: OSError: cannot open resource
- Solution: Ensure all required libraries (Cairo, Pango, GDK-Pixbuf, GTK+3) are correctly installed and the environment variables are properly set. Revisit the Prerequisites section.

Missing API Keys
- Error: NoneType errors related to API keys.
- Solution: Ensure your .env file contains valid DEEPINFRA_API_KEY and OPENROUTER_API_KEY values.

YouTube Download Failures
- Error: FileNotFoundError or download-related exceptions.
- Solution: Verify the YouTube URL is correct and accessible. Ensure yt-dlp is up to date:
```
pip install --upgrade yt-dlp
```

Transcription Failures
- Error: API request errors or transcription issues.
- Solution: Check your DeepInfra API key and ensure you have sufficient quota. Review network connectivity and API status.

SRT Parsing Errors
- Error: Invalid SRT format or parsing exceptions.
- Solution: Ensure the SRT/VTT file is correctly formatted. Use tools like Subtitle Edit to validate and fix subtitle files.
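
For a quick look at SRT parsing errors without a dedicated editor, a rough structural check can be scripted. The sketch below is an assumption-laden helper, not part of the tool: it covers SRT only (not VTT) and verifies just that each cue block has an index line, a timing line in `HH:MM:SS,mmm --> HH:MM:SS,mmm` form, and at least one line of text.

```python
import re

# SRT timing line, e.g. "00:00:01,000 --> 00:00:02,500"
TIMING = re.compile(r"^\d{2,}:\d{2}:\d{2},\d{3} --> \d{2,}:\d{2}:\d{2},\d{3}")


def srt_problems(text: str) -> list[str]:
    """Return one message per cue block that does not look like valid SRT."""
    problems = []
    # Cue blocks are separated by blank lines.
    blocks = [b for b in re.split(r"\n\s*\n", text.strip()) if b.strip()]
    for number, block in enumerate(blocks, start=1):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            problems.append(f"block {number}: expected index, timing, and text lines")
        elif not lines[0].strip().isdigit():
            problems.append(f"block {number}: first line is not a numeric index")
        elif not TIMING.match(lines[1].strip()):
            problems.append(f"block {number}: malformed timing line")
    return problems
```

An empty result means only that the basic structure looks plausible; overlapping or out-of-order timestamps are not detected.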

Logging

The tool provides detailed logs to help identify issues. Review the console output for error messages and debugging information.
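
When embedding the tool in a larger script, console logging of this kind can be set up with Python's standard logging module. The sketch below is illustrative only: the logger name and the `verbose` switch are assumptions, since the README does not document a verbosity option.

```python
import logging


def configure_logging(verbose: bool = False) -> logging.Logger:
    """Send timestamped log records to the console; DEBUG level when verbose."""
    level = logging.DEBUG if verbose else logging.INFO
    # basicConfig installs a console handler on the root logger (first call only).
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    logger = logging.getLogger("summarize_anything")  # hypothetical logger name
    logger.setLevel(level)
    return logger
```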

Contributing

Contributions are welcome! Please follow these steps:

Fork the Repository
Create a Feature Branch
```
git checkout -b feature/YourFeature
```
Commit Your Changes
```
git commit -m "Add YourFeature"
```
Push to the Branch
```
git push origin feature/YourFeature
```
Open a Pull Request

License

This project is licensed under the MIT License.

Disclaimer: Ensure you have the rights to download and process YouTube videos. Respect copyright laws and YouTube's terms of service.

Repository: rodion-m/summarize_anything
