[English] | 中文
Easy-NotebookLM is a tool designed to generate Chinese podcasts based on the concept of Google's NotebookLM. It transforms content from PDF documents into natural, conversational-style podcast audio and supports multiple dialogue styles. Currently, it uses the DeepSeek-v3 API for model inference, though you can replace it with any other compatible model interface. For speech synthesis, it leverages the CosyVoice model.
🎙️ NotebookLM is an AI audio content generation tool developed by Google, previously limited to English content only. Easy-NotebookLM aims to provide a simple and user-friendly way to quickly generate high-quality Chinese podcasts from PDF files.
- Python version:
3.12.8
- Other dependencies: see
requirements.txt
- Additional dependency:
ffmpeg
(must be installed manually)
# Install dependencies
pip install -r requirements.txt
# Install ffmpeg (Mac/Linux examples)
brew install ffmpeg # macOS
sudo apt-get install ffmpeg # Ubuntu/Debian
Before running, ensure the following environment variables are configured or set in the code:
DEEPSEEK_API_KEY
(for large language model inference)DASHSCOPE_API_KEY
(optional, for Alibaba Cloud's Bailing platform)
Create a .env
file in the project root directory and add your keys as follows:
DEEPSEEK_API_KEY=""
DASHSCOPE_API_KEY=""
File Name | Description |
---|---|
notelm.py |
Stable base version script for podcast generation, suitable for most users |
notelm_react.py |
Enhanced version with ReAct pattern support; automatically evaluates generated results and regenerates if not up to standard. Still under development |
prompt.txt |
Prompt template file used to control dialogue style |
speech/ |
Directory containing text materials used for podcast generation |
speech_pre/ |
Pre-generated podcast dialogues (both audio and text) |
You can customize the podcast generation process using command line arguments:
Argument | Description |
---|---|
--input |
Path to the input PDF file |
--output_dir |
Output directory for generated files |
--len |
Maximum length (in characters) per dialogue segment |
--minutes |
Target duration of the generated audio (in minutes) |
--count |
Number of dialogue segments to generate |
--prompt_type |
Dialogue style type, default is 'default' . See below for options: |
Type | Description |
---|---|
default |
Default mode: host primarily guides the guest through Q&A |
discussion |
Discussion mode: host has some knowledge and actively interacts with the guest |
teaching |
Teaching mode: dialogue between a teacher and student |
argument |
Debate mode: host and guest have opposing views and engage in discussion |
interview |
Interview mode: interviewer asks questions, interviewee answers; ideal for technical articles |
⚠️ Note: These parameters offer approximate control — actual output may vary slightly.
Basic usage:
python notelm.py --input sample.pdf --output_dir ./output --prompt_type discussion
Enable ReAct mode (experimental):
python notelm_react.py --input sample.pdf --output_dir ./output --count 5
- 🔜 Add support for MCP (upcoming feature)
- 🔜 Support additional models
- 🔜 Add GUI support
- 🔜 Support video generation with embedded subtitles
- 🔜 Support Markdown or Word document inputs
All contributions are welcome! Whether it's feature suggestions, bug reports, or pull requests — we'd love to hear from you!
This project is licensed under the Apache 2.0 License.
- Inspired by Google NotebookLM
- Thanks to open-notebooklm for reference implementation ideas
- Powered by DeepSeek and CosyVoice
If you like this project, feel free to give it a Star, Fork, or share it with others! 🚀
Got any questions or suggestions? Feel free to open an issue or leave a comment.