"Depending on the author, a million books can be distilled into a single sentence, and a single sentence can conceal a million books, depending on the Author." - Me
SUM is a knowledge distillation platform that uses AI, NLP, and ML to extract, analyze, and present insights from large datasets in a structured, concise, and engaging form. Given access to potentially all kinds of knowledge, its goal is to condense that material into a succinct, dense, human-readable form, so you can "download" whole tomes quickly without the fluff. Use it for writing, brainstorming, copywriting, semantic analysis, or simply to save yourself the reading. Here is a proof of concept on the amazing Tldraw Computer platform: https://computer.tldraw.com/t/7aR3GPvat7gK5s2TRKGnNG
And here is an implementation on the mythical Websim https://websim.ai/p/vvz4uk4ik02f43adxduf/1
SUM (Summarizer) is an advanced tool for knowledge distillation, leveraging cutting-edge AI, NLP, and ML techniques to transform vast datasets into concise and insightful summaries. Key features include:
- Multi-level summarization (tags, sentences, paragraphs)
- Interactive analysis with user feedback
- Temporal analysis for tracking concept and sentiment changes
- Topic modeling for cross-document analysis
- Knowledge Graph construction and visualization
- Multi-lingual support with language detection and translation
- Adaptive parameter adjustment based on user feedback
- Comprehensive text analysis (entity recognition, keyword extraction, sentiment analysis)
- Word cloud generation
- Data export functionality
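To make the idea of multi-level summarization concrete, here is a minimal sketch using only the Python standard library. It is not SUM's actual implementation: the function names (`tokenize`, `split_sentences`, `summarize_levels`), the tiny stopword list, and the frequency-based scoring are all assumptions chosen for illustration. It produces two of the levels listed above, tags and sentences.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption); a real pipeline would
# more likely use the NLTK or spaCy stopword lists.
STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from",
    "in", "into", "is", "it", "of", "on", "or", "that", "the", "to", "with",
}


def tokenize(text):
    """Lowercase the text and return its word tokens, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize_levels(text, num_tags=3, num_sentences=1):
    """Return {'tags': [...], 'sentences': [...]} for the given text."""
    freq = Counter(tokenize(text))

    # Tag level: the most frequent non-stopword terms.
    tags = [word for word, _ in freq.most_common(num_tags)]

    # Sentence level: score each sentence by the summed frequency of its words.
    sentences = split_sentences(text)
    scored = sorted(
        enumerate(sentences),
        key=lambda pair: sum(freq[w] for w in tokenize(pair[1])),
        reverse=True,
    )
    # Keep the highest-scoring sentences, restored to document order.
    top = sorted(scored[:num_sentences], key=lambda pair: pair[0])
    return {"tags": tags, "sentences": [s for _, s in top]}


if __name__ == "__main__":
    sample = (
        "Graphs store knowledge. "
        "Knowledge graphs link topics and knowledge. "
        "Cats sleep."
    )
    print(summarize_levels(sample, num_tags=2, num_sentences=1))
```

Frequency scoring favors long sentences and ignores meaning, so treat this only as a baseline for seeing how the tag and sentence levels relate.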
To install the required libraries, run:
pip install nltk spacy scikit-learn networkx matplotlib pandas wordcloud textblob gensim langdetect googletrans==3.1.0a0
python -m spacy download en_core_web_lg
python -m nltk.downloader punkt stopwords wordnet
from advanced_summarizer import AdvancedSUM
summarizer = AdvancedSUM()
summarizer.simulate_interactive_analysis()
texts = summarizer.load_data('data.json')
results = summarizer.batch_process(texts)
summarizer.temporal_analysis(results)
summarizer.export_results(results, 'analysis_results.json')
Loads data from a JSON file.
Processes and analyzes a single text with multi-level summarization.
Processes and analyzes a batch of texts with multi-level summarization.
Performs temporal analysis on processed texts.
Generates a word cloud from the text.
Performs topic modeling on a collection of texts.
Translates the text to the target language.
Builds a knowledge graph from identified topics.
Visualizes the knowledge graph using NetworkX and Matplotlib.
Exports analysis results to a JSON file.
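As a rough illustration of the temporal analysis and JSON export steps described above, the sketch below counts how often chosen terms appear per calendar month and writes the result to a file. Everything here is an assumption for illustration: the record layout (`date` and `text` keys), the helper names `term_counts_by_period` and `export_json`, and the whitespace tokenization are hypothetical and do not reflect how AdvancedSUM works internally.

```python
import json
from collections import Counter, defaultdict


def term_counts_by_period(records, terms):
    """Count occurrences of each tracked term per month.

    `records` is a list of {"date": "YYYY-MM-DD", "text": "..."} dicts
    (a hypothetical layout). Periods are calendar months, "YYYY-MM".
    """
    counts = defaultdict(Counter)
    for record in records:
        period = record["date"][:7]  # "YYYY-MM-DD" -> "YYYY-MM"
        words = record["text"].lower().split()
        for term in terms:
            counts[period][term] += words.count(term)
    # ISO-formatted periods sort chronologically as plain strings.
    return {period: dict(counts[period]) for period in sorted(counts)}


def export_json(data, path):
    """Write `data` to `path` as indented UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    records = [
        {"date": "2024-02-10", "text": "AI helps summarization"},
        {"date": "2024-01-05", "text": "summarization of books"},
        {"date": "2024-01-20", "text": "books and more books"},
    ]
    trends = term_counts_by_period(records, ["books", "summarization"])
    print(trends)
    export_json(trends, "term_trends.json")
```

Grouping by month keeps the example short; a finer or coarser period only requires changing how `period` is derived from the date string.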
We welcome contributions from the community. If you have ideas for improvements or new features, please follow these steps:
- Fork the repository.
- Create a new branch (git checkout -b feature-branch).
- Make your changes.
- Commit your changes (git commit -m 'Add some feature').
- Push to the branch (git push origin feature-branch).
- Open a pull request.
For any questions, concerns, or suggestions, please reach out via:
- X: https://x.com/Otota0
- Issues: SUM Issues
I look forward to your feedback and contributions!
This project is licensed under the MIT License. See the LICENSE file for details.
Thank you for using SUM! I hope it helps you distill knowledge effortlessly.
Made with ❤️ by ototao