“Simple Classifier” is a user-friendly library for people who are new to machine learning and want an accessible way to experiment with classification algorithms. It covers the full workflow, from data preparation and preprocessing to training, evaluating, and comparing algorithms, without requiring extensive coding knowledge.
Simple Classifier lets users customize the splitting method for their needs and performs hyperparameter tuning to find suitable parameters for each model. Its performance comparison helps users identify the best-performing algorithm for their dataset, which makes the library a useful starting point for newcomers to machine learning.
Check out the Getting Started section for further information.
This project requires Python 3.9 or later. Create a virtual environment with your favorite tool (e.g. conda or virtualenv), and install the dependencies:
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
Before running any commands, add the project’s root directory to your PYTHONPATH environment variable. This is necessary because the project uses absolute import paths, which are needed for the CI tests to pass.
Required setup steps:
- Open a terminal in the project root directory.
- On Linux and macOS, run the following command to add the project root directory to your PYTHONPATH:
export PYTHONPATH=$PYTHONPATH:$(pwd)
After this setup, imports of the Simple Classifier library should resolve without errors. Repeat it in every new terminal session, because export only affects the current shell and the processes started from it.
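If you launch the tool from a Python script rather than an interactive shell, the same PYTHONPATH setup can be prepared programmatically. The following is a minimal sketch using only the standard library; the helper names `pythonpath_entries` and `env_with_project_root` are hypothetical and not part of Simple Classifier.

```python
import os
from pathlib import Path


def pythonpath_entries(value: str) -> list[Path]:
    """Split a PYTHONPATH-style string into resolved paths, skipping empty entries."""
    return [Path(entry).resolve() for entry in value.split(os.pathsep) if entry]


def env_with_project_root(root: str, env: dict[str, str]) -> dict[str, str]:
    """Return a copy of env whose PYTHONPATH contains root, appending it at most once."""
    new_env = dict(env)
    current = new_env.get("PYTHONPATH", "")
    root_path = Path(root).resolve()
    if root_path not in pythonpath_entries(current):
        # Join only non-empty parts so no empty entry ends up in the variable.
        parts = [part for part in (current, str(root_path)) if part]
        new_env["PYTHONPATH"] = os.pathsep.join(parts)
    return new_env


if __name__ == "__main__":
    # Assumes the script is started from the project root directory.
    env = env_with_project_root(os.getcwd(), dict(os.environ))
    print(env["PYTHONPATH"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting the command-line tool from another script.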
simpleclassifier can be executed as a CLI tool. To run it, use the following command:
python simpleclassifier -y <PATH_TO_YAML_CONFIG_FILE>
Sample YAML configuration files can be found in samples. For example, the following configuration runs the knn model on the diabetes dataset:
classifier_names:
- knn
dataset_name: diabetes
splitting_strategy: percentage
test_size: 0.50
profile_metrics:
- accuracy
display_format: dump
To run it, use the following command:
python simpleclassifier -y samples/knn_breast_cancer_percentage_50_accuracy_dump.yml
And watch the magic happen!
Training all classifiers...
- KNNClassifier [Done]
Profiling all classifiers...
- KNNClassifier [Done]
Displaying results...
=====================================
-------------------------------------
accuracy
- KNNClassifier: 0.968421052631579
=====================================
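To make the sample configuration more concrete, the sketch below illustrates in plain Python what a percentage split with test_size 0.50 and the accuracy metric mean. It is an illustration under assumptions, not the library's implementation: the function names `percentage_split` and `accuracy`, the shuffling, the rounding rule, and the `seed` parameter are all hypothetical.

```python
import random
from typing import Sequence


def percentage_split(n_samples: int, test_size: float, seed: int = 0) -> tuple[list[int], list[int]]:
    """Shuffle sample indices and hold out a test_size fraction as the test set."""
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be strictly between 0 and 1")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = round(n_samples * test_size)  # assumed rounding rule
    return indices[n_test:], indices[:n_test]  # (train indices, test indices)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    train_idx, test_idx = percentage_split(n_samples=10, test_size=0.50)
    print(len(train_idx), len(test_idx))  # 5 5
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
```

With test_size 0.50, half of the samples are held out for evaluation, and the reported accuracy is the share of held-out samples that the classifier labels correctly.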
The official documentation is built using Sphinx and hosted on this website.
You can build the documentation from its source files by running these commands:
cd docs
make html
You can find the final report for this project here or you can download it from the root directory of this repository.
- Junda Ai
- Brian Catraguna
- Jayesh Gaur
- Shreyash Gondane
- Gunjan Hirenkumar Dayani