GitHub - SergeySetti/aibreakfast: This project demonstrates text embeddings and semantic analysis techniques using Python. It starts with handcrafted vectors for simple concepts (queen, king, cat, etc.), then moves to real-world applications using the SentenceTransformer model.

This project demonstrates applications of text embeddings and semantic analysis in Python. Its key components and techniques are:

Initial Vector Space Demonstration
- Creates a handcrafted vector space for concepts like "queen", "king", "cat", "dog", "lion_king", and "castle"
- Each concept is represented by 6 dimensions: kingdom, male, is_human, is_animal, is_fictional, and size
- Visualizes these vectors in 2D and 3D space using PCA (Principal Component Analysis)
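The handcrafted space described above can be sketched roughly as follows. The six dimension names come from the list; the numeric values per concept are illustrative guesses, not the values used in the notebooks, and PCA is done here with a plain NumPy SVD (the hypothetical `pca_project` helper) rather than whichever library the project actually uses.

```python
import numpy as np

# Dimension names follow the list above; all numeric values are hypothetical.
DIMS = ["kingdom", "male", "is_human", "is_animal", "is_fictional", "size"]
CONCEPTS = {
    "queen":     [1.0, 0.0, 1.0, 0.0, 0.0, 0.5],
    "king":      [1.0, 1.0, 1.0, 0.0, 0.0, 0.5],
    "cat":       [0.0, 0.5, 0.0, 1.0, 0.0, 0.1],
    "dog":       [0.0, 0.5, 0.0, 1.0, 0.0, 0.2],
    "lion_king": [1.0, 1.0, 0.0, 1.0, 1.0, 0.6],
    "castle":    [1.0, 0.0, 0.0, 0.0, 0.0, 1.0],
}

def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components via SVD."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

names = list(CONCEPTS)
matrix = [CONCEPTS[name] for name in names]
coords_2d = pca_project(matrix, n_components=2)  # for a 2D scatter plot
coords_3d = pca_project(matrix, n_components=3)  # for a 3D scatter plot

for name, (x, y) in zip(names, coords_2d):
    print(f"{name:>10}: ({x:+.2f}, {y:+.2f})")
```

The resulting coordinates can be passed to any plotting library; concepts that share most feature values, such as "cat" and "dog" here, stay close together after projection.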
Real-world Application
- Uses the SentenceTransformer model 'all-mpnet-base-v2' for generating text embeddings
- Works with a dataset of animal names stored in 'animals.parquet'
- Demonstrates the applications listed under Key Features
Key Features:
- Similarity Search: Shows how to find semantically similar terms using cosine similarity
- Clustering: Uses K-means clustering to group similar concepts
- Dimensionality Reduction: Applies PCA to visualize high-dimensional embeddings in 2D/3D space
- Semantic Analysis: Analyzes relationships between words and concepts
- Abstract Concept Analysis: Works with a dataset of abstract nouns to find semantic relationships
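Similarity search over embeddings comes down to ranking candidates by cosine similarity. A minimal sketch, assuming the embeddings are already available as a NumPy array: the toy 3-dimensional vectors, the labels, and the `top_k_similar` helper are hypothetical stand-ins, not code from the repository.

```python
import numpy as np

def cosine_similarity_matrix(queries, corpus):
    """Cosine similarity between every query row and every corpus row."""
    q = np.asarray(queries, dtype=float)
    c = np.asarray(corpus, dtype=float)
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    c = c / np.linalg.norm(c, axis=1, keepdims=True)
    return q @ c.T

def top_k_similar(query_vec, corpus_vecs, labels, k=3):
    """Return the k labels whose vectors are most similar to query_vec."""
    scores = cosine_similarity_matrix([query_vec], corpus_vecs)[0]
    order = np.argsort(-scores)[:k]  # indices sorted by descending score
    return [(labels[i], float(scores[i])) for i in order]

# Toy vectors standing in for real embeddings (hypothetical values).
labels = ["sparrow", "eagle", "shark", "salmon"]
vectors = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.3, 0.1],
    [0.1, 0.9, 0.2],
    [0.0, 0.8, 0.3],
])
query = np.array([1.0, 0.2, 0.0])  # stand-in for the embedding of "birds"

results = top_k_similar(query, vectors, labels, k=2)
print(results)
```

With the real model, the vectors would presumably come from something like `SentenceTransformer("all-mpnet-base-v2").encode(texts)`, which returns one embedding per input string; the ranking logic stays the same.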
Notable Examples:
- Finds animals similar to "birds"
- Shows clustering of animal concepts
- Analyzes semantic aspects of text, such as laptop reviews
- Maps relationships between concrete and abstract concepts
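The clustering of animal concepts mentioned above can be illustrated with a small K-means sketch. This is a plain NumPy version of Lloyd's algorithm with a deterministic farthest-first initialization, written only for illustration; the project may well use a library implementation instead, and the 2D points and names below are hypothetical stand-ins for real embeddings.

```python
import numpy as np

def kmeans(X, k, n_iters=50):
    """Minimal K-means (Lloyd's algorithm) with farthest-first initialization."""
    X = np.asarray(X, dtype=float)
    # Start from the first point, then repeatedly add the point that is
    # farthest from all centers chosen so far (deterministic, no randomness).
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)

    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        # Update step: move each center to the mean of its points;
        # an empty cluster keeps its previous center.
        new_centers = np.array([
            X[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign, centers

# Hypothetical 2D points standing in for (reduced) animal embeddings.
names = ["cat", "dog", "lion", "salmon", "shark", "trout"]
points = np.array([
    [0.9, 0.1], [0.8, 0.2], [1.0, 0.0],
    [0.1, 0.9], [0.2, 1.0], [0.0, 0.8],
])
assignments, centers = kmeans(points, k=2)

for name, cluster in zip(names, assignments):
    print(f"{name}: cluster {cluster}")
```

On real sentence embeddings the same call would run on the full high-dimensional vectors; the choice of `k` is an assumption that has to be tuned per dataset.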

The project shows how embedding-based NLP techniques can quantify semantic relationships between words and concepts, which is useful for applications such as semantic search, content recommendation, and text analysis.


Repository Files (history: 4 commits)
- data
- src
- .gitignore
- README.md
- demo.ipynb
- requirements.txt
- vectors_busines_cases.ipynb
