A FastAPI-based project for fetching YouTube video transcripts and processing them with OpenAI's GPT models for summarization and analysis.
- Fetch YouTube video transcripts using the
youtube-transcript-api
. - Summarize and analyze transcripts using OpenAI's GPT models.
- Expose APIs for transcript retrieval and optional processing.
- Python 3.12+
- pip (Python package manager)
git clone https://github.com/your-repo/youtube-transcript-api.git
cd youtube-transcript-api
python -m venv venv
source venv/bin/activate # On Linux/Mac
.\venv\Scripts\activate # On Windows
pip install -r requirements.txt
Create a .env
file in the root directory:
OPENAI_API_KEY=your_openai_api_key_here
Replace your_openai_api_key_here
with your actual OpenAI API key.
Ensure you have Docker and Docker Compose installed on your machine. Then, run the following command to set up the necessary infrastructure:
docker-compose up -d
This will start the required services defined in your docker-compose.yml
file.
Run the following command to apply database migrations using yoyo-migrations
:
yoyo apply
Replace username
, password
, and youtube_transcripts
with your actual database credentials.
Run the FastAPI application using Uvicorn:
uvicorn app.__init__:app --reload
- Swagger UI: http://127.0.0.1:8000/docs
- ReDoc: http://127.0.0.1:8000/redoc
-
Fetch Transcript
GET /transcript/{video_id}
- Parameters:
video_id
(str): The YouTube video ID.
- Response:
{ "video_id": "example_id", "transcript": "Transcript content..." }
- Parameters:
-
Fetch and Summarize Transcript
GET /transcript/{video_id}?summarize=true
- Query Parameter:
summarize
(bool): Set totrue
to include a summary.
- Response:
{ "video_id": "example_id", "transcript": "Transcript content...", "summary": "Summarized content..." }
- Query Parameter:
youtube-transcript-api/
│
├── app/
│ ├── __init__.py # FastAPI app entry point
│ ├── api/ # API routes
│ │ ├── endpoints/
│ │ │ ├── transcript.py # Transcript-related endpoints
│ ├── core/ # Core configurations
│ │ ├── config.py # Config file for environment variables
│ ├── services/ # Business logic
│ │ ├── openai_service.py # Integration with OpenAI API
│ │ ├── transcript_service.py # Logic for transcript fetching
│ ├── utils/ # Utility functions
│
├── tests/ # Tests for your application
├── .env # Environment variables
├── requirements.txt # Dependencies
├── README.md # Project documentation
pytest
Contributions are welcome! Feel free to open issues or submit pull requests.
This project is licensed under the MIT License.
Josep Servat
For any queries, contact at DM on x @servatj or linkedin https://www.linkedin.com/in/servatj/.