OtotaO/SUM

SUM (Summarizer): The Ultimate Knowledge Distiller

"Depending on the author, a million books can be distilled into a single sentence, and a single sentence can conceal a million books, depending on the Author." - Me

Mission Statement

SUM is a knowledge distillation platform that uses AI, NLP, and ML to extract, analyze, and present insights from large datasets in a structured, concise, and engaging form. The goal is to condense knowledge of any kind into a succinct, dense, human-readable summary, so you can "download" whole tomes quickly and skip the fluff. Use it for writing, brainstorming, copywriting, semantic analysis, or simply for "lazy faire".

Here is a proof of concept on the amazing Tldraw Computer platform: https://computer.tldraw.com/t/7aR3GPvat7gK5s2TRKGnNG

And here is an implementation on the mythical Websim: https://websim.ai/p/vvz4uk4ik02f43adxduf/1

Overview

SUM (Summarizer) applies AI, NLP, and ML techniques to turn large collections of text into concise summaries. Key features include:

  • Multi-level summarization (tags, sentences, paragraphs)
  • Interactive analysis with user feedback
  • Temporal analysis for tracking concept and sentiment changes
  • Topic modeling for cross-document analysis
  • Knowledge Graph construction and visualization
  • Multi-lingual support with language detection and translation
  • Adaptive parameter adjustment based on user feedback
  • Comprehensive text analysis (entity recognition, keyword extraction, sentiment analysis)
  • Word cloud generation
  • Data export functionality

Installation

To install the required libraries, run:

pip install nltk spacy scikit-learn networkx matplotlib pandas wordcloud textblob gensim langdetect googletrans==3.1.0a0
python -m spacy download en_core_web_lg
python -m nltk.downloader punkt stopwords wordnet
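
After installing, a quick check that every dependency is importable can save a confusing traceback later. The sketch below is not part of SUM; the helper name `find_missing` is hypothetical, and the module list simply mirrors the install command above (note that scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names for the packages in the install command above.
# scikit-learn installs under the import name "sklearn".
REQUIRED_MODULES = [
    "nltk", "spacy", "sklearn", "networkx", "matplotlib", "pandas",
    "wordcloud", "textblob", "gensim", "langdetect", "googletrans",
]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```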

Usage

1. Initialize the Class

from advanced_summarizer import AdvancedSUM

summarizer = AdvancedSUM()

2. Interactive Analysis

summarizer.simulate_interactive_analysis()

3. Batch Processing

texts = summarizer.load_data('data.json')
results = summarizer.batch_process(texts)
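
The README does not document the layout that `load_data` expects in `data.json`. As an illustration only, the sketch below assumes a JSON array of objects with a `text` field and an optional `timestamp` field; both field names and the helper functions are assumptions, not the project's confirmed format. It writes such a file and reads it back into the parallel `texts` and `timestamps` lists that `batch_process(texts, timestamps=None)` appears to take.

```python
import json
import os
import tempfile

# Hypothetical record layout: the actual format used by SUM is not documented.
sample_records = [
    {"text": "First document to summarize.", "timestamp": "2024-01-01"},
    {"text": "Second document to summarize.", "timestamp": "2024-02-01"},
]


def write_sample(path, records):
    """Write the records as a UTF-8 JSON array."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)


def read_texts(path):
    """Read the file back and return (texts, timestamps) as parallel lists."""
    with open(path, encoding="utf-8") as f:
        records = json.load(f)
    texts = [record["text"] for record in records]
    timestamps = [record.get("timestamp") for record in records]
    return texts, timestamps


# Write to a temporary directory so nothing in the working tree is touched.
path = os.path.join(tempfile.mkdtemp(), "data.json")
write_sample(path, sample_records)
texts, timestamps = read_texts(path)
```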

4. Temporal Analysis

summarizer.temporal_analysis(results)

5. Export Results

summarizer.export_results(results, 'analysis_results.json')

Methods

load_data(data_source)

Loads data from a JSON file.

process_and_analyze(text, timestamp=None)

Processes and analyzes a single text with multi-level summarization.
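
To make "multi-level summarization" concrete, here is a minimal standard-library sketch of the idea: tags, a single sentence, and a short paragraph drawn from the same text. It uses simple word-frequency scoring, which is an assumption chosen for illustration and not necessarily what `process_and_analyze` does internally; the function names and the tiny stop-word list are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from", "in",
    "into", "is", "it", "of", "on", "or", "that", "the", "to", "with",
}


def tokenize(text):
    """Lowercase the text and return its words, minus stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize_levels(text, num_tags=3, num_sentences=2):
    """Return tag-, sentence-, and paragraph-level summaries of the text."""
    freq = Counter(tokenize(text))
    tags = [word for word, _ in freq.most_common(num_tags)]
    sentences = split_sentences(text)

    def score(sentence):
        # Average frequency of the sentence's words across the whole text.
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    top = sorted(ranked[:num_sentences])  # restore document order
    return {
        "tags": tags,
        "sentence": sentences[ranked[0]] if sentences else "",
        "paragraph": " ".join(sentences[i] for i in top),
    }
```

Calling `summarize_levels(text)` returns a dict with `tags`, `sentence`, and `paragraph` keys, one per level of detail.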

batch_process(texts, timestamps=None)

Processes and analyzes a batch of texts with multi-level summarization.

temporal_analysis(results)

Performs temporal analysis on processed texts.
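
One simple form of temporal analysis is counting how often a concept appears in each time period. The sketch below shows that idea with the standard library only; the input shape (pairs of ISO date string and text) and the function name `term_trend` are assumptions for illustration, and the structure of the `results` object that `temporal_analysis` actually consumes is not documented here. Sentiment tracking is left out.

```python
import re
from collections import defaultdict
from datetime import datetime


def term_trend(dated_texts, term):
    """Count whole-word occurrences of `term` per calendar month.

    dated_texts: iterable of (ISO date string, text) pairs (assumed shape).
    Returns a list of (YYYY-MM, count) tuples in chronological order.
    """
    counts = defaultdict(int)
    pattern = re.compile(r"\b" + re.escape(term.lower()) + r"\b")
    for stamp, text in dated_texts:
        month = datetime.fromisoformat(stamp).strftime("%Y-%m")
        counts[month] += len(pattern.findall(text.lower()))
    return sorted(counts.items())
```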

generate_word_cloud(text)

Generates a word cloud from the text.

perform_topic_modeling(texts, num_topics=5)

Performs topic modeling on a collection of texts.

translate_text(text, target_lang='en')

Translates the text to the target language.

build_knowledge_graph(topics)

Builds a knowledge graph from identified topics.
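
As a rough picture of what a topic-based knowledge graph can look like, the sketch below links every pair of keywords that share a topic and weights each edge by how many topics the pair shares. It assumes `topics` is a list of keyword lists and uses a plain adjacency map in place of a NetworkX graph; both choices are assumptions for illustration, and `build_cooccurrence_graph` is a hypothetical name, not the project's method.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(topics):
    """Link every pair of keywords that appear in the same topic.

    topics: list of keyword lists (assumed format).
    Returns {keyword: {neighbor: weight}}, where weight is the number of
    topics in which the two keywords appear together.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for keywords in topics:
        # Deduplicate within a topic so each pair counts once per topic.
        for a, b in combinations(sorted(set(keywords)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return {node: dict(neighbors) for node, neighbors in graph.items()}
```

The resulting edges could be loaded into a `networkx.Graph` with `add_edge(a, b, weight=w)` for visualization.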

visualize_knowledge_graph(G)

Visualizes the knowledge graph using NetworkX and Matplotlib.

export_results(results, filename='analysis_results.json')

Exports analysis results to a JSON file.
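
Analysis results often contain values that the `json` module cannot serialize directly, such as `datetime` objects. The sketch below shows one way to handle that with `default=str`; it is an illustration under that assumption, and `export_to_json` is a hypothetical stand-in, not the body of `export_results`.

```python
import json
from datetime import datetime


def export_to_json(results, filename="analysis_results.json"):
    """Write results as UTF-8 JSON, stringifying values json cannot encode."""
    with open(filename, "w", encoding="utf-8") as f:
        # default=str converts non-serializable values (e.g. datetime) to text.
        json.dump(results, f, ensure_ascii=False, indent=2, default=str)
```

With this approach a `datetime` value is written in its `str()` form, so it comes back as a plain string when the file is reloaded.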

Contribution Guidelines

We welcome contributions from the community. If you have ideas for improvements or new features, please follow these steps:

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature-branch).
  3. Make your changes.
  4. Commit your changes (git commit -m 'Add some feature').
  5. Push to the branch (git push origin feature-branch).
  6. Open a pull request.

Contact

For any questions, concerns, or suggestions, please reach out via:

I look forward to your feedback and contributions!

License

This project is licensed under the MIT License. See the LICENSE file for details.


Thank you for using SUM! I hope it helps you distill knowledge effortlessly.


Made with ❤️ by ototao
