A multi-parameter, multi-model k-fold grid search, designed to find the most promising model for your classification problem.
A template to import, process, and export data in the correct format. How you adapt it depends heavily on your goals.
The classification module containing the classification class.
Currently compatible classifiers:
- SVC
- DecisionTreeClassifier
- KNeighborsClassifier
- LogisticRegression
- GaussianNB
- RandomForestClassifier
- Perceptron
- SGDClassifier
- XGBClassifier
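
As a rough illustration, the classifiers listed above could be collected in a dictionary so that unwanted entries are easy to comment out. This is only a sketch: the `classifiers` name is hypothetical and not taken from this project, and `XGBClassifier` is treated as optional because it comes from the separate `xgboost` package.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, Perceptron, SGDClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical registry: name -> unfitted estimator.
# Comment out any entry you do not want to evaluate.
classifiers = {
    "SVC": SVC(),
    "DecisionTreeClassifier": DecisionTreeClassifier(),
    "KNeighborsClassifier": KNeighborsClassifier(),
    "LogisticRegression": LogisticRegression(),
    "GaussianNB": GaussianNB(),
    "RandomForestClassifier": RandomForestClassifier(),
    "Perceptron": Perceptron(),
    "SGDClassifier": SGDClassifier(),
}

# XGBClassifier lives in the xgboost package, which may not be installed.
try:
    from xgboost import XGBClassifier

    classifiers["XGBClassifier"] = XGBClassifier()
except ImportError:
    pass

print(sorted(classifiers))
```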
A script which joins data_preprocessing and GSClassification modules, in order to return statistics about each of the evaluated classifiers and their best parameters.
You need to set:
- Classifiers (comment out the ones you do not wish to include)
- Grid search parameters
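
The two settings above can be pictured with the following minimal sketch, which runs a 5-fold `GridSearchCV` per model and collects the best accuracy and parameters in a sorted pandas dataframe. It is not the project's actual script: the synthetic data, the `search_space` dictionary, the chosen parameter values, and the column names are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed data (assumption).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Hypothetical search space: name -> (estimator, parameter grid).
# Comment out an entry to exclude that classifier.
search_space = {
    "LogisticRegression": (
        LogisticRegression(max_iter=1000),
        {"C": [0.1, 1.0, 10.0]},
    ),
    "KNeighborsClassifier": (
        KNeighborsClassifier(),
        {"n_neighbors": [3, 5, 7]},
    ),
    "DecisionTreeClassifier": (
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [2, 4, None]},
    ),
}

rows = []
for name, (estimator, param_grid) in search_space.items():
    # cv=5 performs a 5-fold cross-validated search over the grid.
    search = GridSearchCV(estimator, param_grid, cv=5, scoring="accuracy")
    search.fit(X, y)
    rows.append(
        {
            "classifier": name,
            "best_accuracy": search.best_score_,
            "best_params": search.best_params_,
        }
    )

# Best-performing model first.
results = (
    pd.DataFrame(rows)
    .sort_values("best_accuracy", ascending=False)
    .reset_index(drop=True)
)
print(results)
```

Keeping each estimator next to its grid means a classifier and its parameters are enabled or disabled together.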
This software comes with an example from Kaggle's introductory competition. It preprocesses the data, tests different parameters for different models (running k-fold grid searches to avoid overfitting), and returns:
- a sorted pandas dataframe with the best accuracies and parameters;
- the cumulative accuracy profile (CAP) along with an image representation of the above metrics.
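
For readers unfamiliar with the CAP, the sketch below shows one common way to compute its points: samples are ranked by predicted score, and the cumulative share of positives captured is tracked against the share of samples inspected. The `cap_curve` helper is hypothetical and not part of this project; it assumes binary 0/1 labels with at least one positive.

```python
import numpy as np


def cap_curve(y_true, y_score):
    """Return (fraction of samples, fraction of positives captured).

    Hypothetical helper: assumes binary 0/1 labels and at least one positive.
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)

    # Rank samples from highest to lowest predicted score.
    order = np.argsort(-y_score, kind="stable")
    captured = np.cumsum(y_true[order])

    # Prepend the origin so the curve starts at (0, 0).
    x = np.arange(len(y_true) + 1) / len(y_true)
    y = np.concatenate([[0], captured]) / y_true.sum()
    return x, y


x, y = cap_curve([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
print(x)
print(y)
```

Plotting `y` against `x` gives the model's CAP; the diagonal from (0, 0) to (1, 1) corresponds to random selection.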
To modify the grid search parameters, see scikit-learn's GridSearchCV guide.