SmartFoodSelector revolutionizes food analysis and recommendation using cutting-edge AI! 🚀 This modular system combines machine learning, Bayesian inference, and symbolic logic to:
- 🧹 Preprocess and normalize nutritional data.
- 🎯 Cluster products into meaningful groups via k-Means.
- 🤖 Predict categories for new items with Random Forests and Logistic Regression.
- 🔮 Model probabilistic relationships with Bayesian Networks.
- 🧩 Execute logical queries via Prolog knowledge bases.
- Cleaning: Handle missing values and outliers.
- Feature Selection: Focus on core nutritional metrics (`energy_100g`, `fat_100g`, etc.).
- Normalization: Scale features using `MinMaxScaler` for balanced analysis.
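
As a rough illustration of the three preprocessing steps, here is a minimal sketch. It assumes median imputation and IQR-based clipping, which are common choices but not necessarily what `dataset_preprocessing.py` does; the `preprocess` function and the sample values are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Only the two columns named in this README; the real feature list is longer.
FEATURES = ["energy_100g", "fat_100g"]


def preprocess(df, features=FEATURES):
    """Hypothetical helper: impute, clip outliers, then scale to [0, 1]."""
    out = df[features].copy()
    # Cleaning (assumed strategy): fill missing values with the column median.
    out = out.fillna(out.median())
    # Cleaning (assumed strategy): clip outliers to the 1.5 * IQR fences.
    q1, q3 = out.quantile(0.25), out.quantile(0.75)
    iqr = q3 - q1
    out = out.clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr, axis=1)
    # Normalization: rescale every feature to the [0, 1] range.
    out[features] = MinMaxScaler().fit_transform(out[features])
    return out


# Made-up sample rows with a missing value per column and one extreme outlier.
raw = pd.DataFrame({
    "energy_100g": [250.0, 1800.0, None, 900.0, 50000.0],
    "fat_100g": [1.0, 30.0, 12.0, None, 8.0],
})
clean = preprocess(raw)
print(clean)
```

Clipping before scaling keeps a single extreme value from compressing every other product into a narrow band near zero.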
- Elbow Method: Automatically determine optimal clusters with `kneed`.
- Visualization: Explore cluster distributions via pie charts and PCA-reduced plots.
- Output: Generate `clustered_dataset.csv` for downstream tasks.
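
The elbow idea can be sketched without extra dependencies. The project uses `kneed` for this step; the `elbow_k` function below is a hand-rolled stand-in (the point farthest from the chord joining the ends of the inertia curve), and the synthetic blobs are an assumption made only to keep the example self-contained.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs


def elbow_k(ks, inertias):
    """Stand-in for kneed: pick the k farthest below the endpoint chord."""
    x = (np.asarray(ks, dtype=float) - ks[0]) / (ks[-1] - ks[0])
    y = (np.asarray(inertias, dtype=float) - inertias[-1]) / (inertias[0] - inertias[-1])
    # After normalization the chord runs from (0, 1) to (1, 0), so the
    # distance of a point below it is proportional to 1 - x - y.
    return ks[int(np.argmax(1.0 - x - y))]


# Synthetic stand-in data: three well-separated groups in two dimensions.
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 0], [5, 9]],
    cluster_std=0.5,
    random_state=0,
)

ks = list(range(1, 9))
# Inertia (within-cluster sum of squares) for each candidate k.
inertias = [
    KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_ for k in ks
]

best_k = elbow_k(ks, inertias)
labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X)
print(best_k)
```

On real nutritional data the inertia curve is usually less sharp than on these blobs, which is one reason to prefer a dedicated knee detector over a hand-rolled heuristic.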
- Models: Train `Decision Trees`, `Random Forests`, and `Logistic Regression`.
- Balancing: Address class imbalance with SMOTE.
- Evaluation: Compare metrics (Accuracy, F1-score, Precision/Recall) and interpret results via SHAP values.
- Learning Curves: Diagnose overfitting/underfitting.
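
A minimal sketch of the model comparison, assuming synthetic data in place of the clustered dataset. SMOTE, SHAP, and learning curves are left out to keep the example dependent on scikit-learn only; the hyperparameters are illustrative defaults, not the project's tuned values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic 3-class problem standing in for "features -> cluster label".
X, y = make_classification(
    n_samples=600,
    n_features=6,
    n_informative=4,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

# Fit each model and collect accuracy and macro-averaged F1 on the held-out split.
scores = {}
for name, model in models.items():
    pred = model.fit(X_train, y_train).predict(X_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "f1_macro": f1_score(y_test, pred, average="macro"),
    }

for name, metrics in scores.items():
    print(name, metrics)
```

Macro-averaged F1 weights every class equally, so it is a useful companion to accuracy when classes are imbalanced.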
- Continuous & Discrete Models: Learn probabilistic dependencies with `pgmpy`.
- Inference: Predict preferences and handle missing data.
- Visualization: Plot dependency graphs for interpretability.
- Automated Generation: Convert clustered data into Prolog facts/rules.
- Query Interface: Use `pyswip` to execute logical rules like:
  - `product_info(E, F, C, Su, P, Sa, Cluster)` for cluster lookup.
  - Custom constraints (e.g., `high_protein_low_sugar`).
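
The fact-generation step can be sketched in plain Python. The predicate name and arity come from the query shown above; the column names in `FIELDS`, the `to_prolog_fact` helper, the rule body, and its thresholds are all hypothetical, and the actual `generate_prolog_knowledge_base.py` may format things differently.

```python
# Assumed mapping of the seven arguments to dataset columns (hypothetical).
FIELDS = [
    "energy_100g", "fat_100g", "carbohydrates_100g",
    "sugars_100g", "proteins_100g", "salt_100g",
]


def to_prolog_fact(row):
    """Format one clustered product as a product_info/7 fact."""
    args = ", ".join(f"{row[name]:.3f}" for name in FIELDS)
    return f"product_info({args}, {int(row['cluster'])})."


# Hypothetical rule: thresholds on normalized protein and sugar values.
HIGH_PROTEIN_LOW_SUGAR = (
    "high_protein_low_sugar(E, F, C, Su, P, Sa, Cluster) :- "
    "product_info(E, F, C, Su, P, Sa, Cluster), P > 0.6, Su < 0.2."
)

# Made-up rows standing in for clustered_dataset.csv.
rows = [
    {"energy_100g": 0.1, "fat_100g": 0.2, "carbohydrates_100g": 0.3,
     "sugars_100g": 0.05, "proteins_100g": 0.7, "salt_100g": 0.01, "cluster": 2},
    {"energy_100g": 0.8, "fat_100g": 0.6, "carbohydrates_100g": 0.5,
     "sugars_100g": 0.4, "proteins_100g": 0.1, "salt_100g": 0.02, "cluster": 0},
]

facts = [to_prolog_fact(row) for row in rows]
knowledge_base = "\n".join(facts + [HIGH_PROTEIN_LOW_SUGAR]) + "\n"
print(knowledge_base)
```

Writing `knowledge_base` to a `.pl` file would let a Prolog engine load it; with `pyswip`, that is typically done through `Prolog().consult(path)` followed by `query(...)`.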
src/
├── dataset_preprocessing.py # Data cleaning/normalization
├── unsupervised_clustering.py # k-Means + Elbow Method
├── supervised_trainer.py # Model training/evaluation
├── bayes_net.py # Bayesian Networks
├── prolog_interface.py # Prolog query handler
└── generate_prolog_knowledge_base.py # Prolog KB generator
main.py # Pipeline coordinator
- Install Dependencies: `pip install pandas scikit-learn pgmpy pyswip optuna shap kneed`
- Preprocess Data: `python main.py --preprocess`
- Run Clustering: `python main.py --cluster`
- Train Models: `python main.py --train`
- Launch Prolog KB: `python main.py --prolog`
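
For readers wondering how one entry point can drive the separate stages, here is a minimal `argparse` sketch. The four flags are the ones listed above; the `STAGES` table, `build_parser`, and `selected_stages` are hypothetical names, and the real `main.py` may be organized differently.

```python
import argparse

# Flag name -> help text. The flags match the commands listed above.
STAGES = {
    "preprocess": "clean and normalize the raw dataset",
    "cluster": "run k-Means with elbow-based k selection",
    "train": "train and evaluate the supervised models",
    "prolog": "generate and query the Prolog knowledge base",
}


def build_parser():
    """Create a parser with one boolean switch per pipeline stage."""
    parser = argparse.ArgumentParser(description="SmartFoodSelector pipeline")
    for name, help_text in STAGES.items():
        parser.add_argument(f"--{name}", action="store_true", help=help_text)
    return parser


def selected_stages(argv):
    """Return the requested stages in pipeline order."""
    args = build_parser().parse_args(argv)
    return [name for name in STAGES if getattr(args, name)]


if __name__ == "__main__":
    # Placeholder dispatch: a real coordinator would call each stage's module.
    for stage in selected_stages(None):
        print(f"running stage: {stage}")
```

Returning the stages in the order of `STAGES` rather than the order typed on the command line keeps preprocessing ahead of clustering even when flags are combined.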
- Clustering: Optimal `k=3` clusters identified via elbow method.
- Supervised Models: Random Forest achieved highest accuracy (92%).
- Bayesian Networks: Enabled probabilistic inference under missing data.
- Prolog Integration: Efficiently answered queries with complex nutritional constraints.
- Report issues or suggest enhancements via GitHub Issues.
- Submit Pull Requests for bug fixes/new features.
MIT License. See LICENSE for details.
🌟 Empower your food decisions with AI and logic! 🌟
For full technical details, refer to the project documentation.