Converting a Book's Text Into an Audiobook With N Voice Actors

I completed the following sub-tasks using LLMs and "prompt engineering"; none of them would be trivial with traditional NLP tools such as spaCy.

Character identification and name clustering (e.g., "Tom", "Tom Sawyer", "Mr. Sawyer", "Thomas Sawyer" -> TOM_SAWYER)
Referential gender inference (TOM_SAWYER -> he/him/his)
Dialogue attribution or quotation speaker identification with coreference resolution
Audio production: converting Speech Synthesis Markup Language (SSML) input for each piece of dialogue into audio data, then stitching the clips together into one audiobook.
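
To show how these sub-tasks hand data to each other, here is a minimal sketch, not the notebook's actual code. It assumes the LLM steps have already produced an alias map, a gender map, and a list of attributed lines. Every name in it (`ALIASES`, `GENDER`, `VOICE_POOL`, `assign_voices`, `build_segments`, and the placeholder voice IDs) is hypothetical. The TTS call and the audio stitching are left out; each `(voice, ssml)` pair would be sent to whatever TTS service is in use.

```python
from xml.sax.saxutils import escape

# Hypothetical output of character identification: alias -> canonical ID.
ALIASES = {
    "Tom": "TOM_SAWYER",
    "Tom Sawyer": "TOM_SAWYER",
    "Mr. Sawyer": "TOM_SAWYER",
    "Thomas Sawyer": "TOM_SAWYER",
    "Aunt Polly": "AUNT_POLLY",
}

# Hypothetical output of referential gender inference.
GENDER = {"TOM_SAWYER": "male", "AUNT_POLLY": "female"}

# Placeholder voice IDs (assumptions); substitute real voice names from your TTS provider.
VOICE_POOL = {
    "male": ["voice-male-1", "voice-male-2"],
    "female": ["voice-female-1", "voice-female-2"],
    "neutral": ["voice-neutral-1"],
}
NARRATOR_VOICE = "voice-narrator"


def assign_voices(characters, gender, pool):
    """Give each character a voice matching its inferred gender, cycling through the pool."""
    counters = {}
    voices = {}
    for character in characters:
        g = gender.get(character, "neutral")
        options = pool.get(g) or pool["neutral"]
        index = counters.get(g, 0)
        voices[character] = options[index % len(options)]
        counters[g] = index + 1
    return voices


def to_ssml(text):
    """Wrap text in a <speak> element, escaping XML special characters."""
    return f"<speak>{escape(text)}</speak>"


def build_segments(lines, aliases, voices, narrator_voice):
    """Turn (speaker, text) pairs into (voice, ssml) pairs; speaker None means narration."""
    segments = []
    for speaker, text in lines:
        character = aliases.get(speaker) if speaker is not None else None
        voice = voices.get(character, narrator_voice)
        segments.append((voice, to_ssml(text)))
    return segments


# Hypothetical output of dialogue attribution.
lines = [
    (None, "Tom & his aunt met at the fence."),
    ("Tom", "Hello!"),
    ("Aunt Polly", "Tom!"),
]
voices = assign_voices(sorted(set(ALIASES.values())), GENDER, VOICE_POOL)
segments = build_segments(lines, ALIASES, voices, NARRATOR_VOICE)
```

Keeping the segments as an ordered list means the synthesized clips can be concatenated in the same order to form the final audiobook.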

I used the Gemini 1.5 model for the LLM part, a choice motivated by GCP's $150 in credits and its integrated TTS API.
