[ENH] Custom Mixing of Oversamplers and Undersamplers #925

BradKML · 2022-09-11T08:22:28Z

Is your feature request related to a problem? Please describe

We are currently comparing the effectiveness of several classifiers on highly imbalanced data:
SVC(kernel="linear", C=0.025), SVC(gamma=2, C=1), GaussianProcessClassifier(1.0 * RBF(1.0)), GaussianNB(), QuadraticDiscriminantAnalysis().
Each classifier is combined with the following oversamplers:
SMOTE(), SMOTEN(), ADASYN(), KMeansSMOTE(), SVMSMOTE()

Problem: The credit card fraud dataset has a class imbalance of about 1 to 1000: https://www.geeksforgeeks.org/ml-credit-card-fraud-detection/

Describe the solution you'd like

The ability to pass an arbitrary oversampler and an arbitrary undersampler to build a SMOTEENN/SMOTETomek-like combined resampler. My current understanding, from tutorials on binary classification, is that:

If the majority and the minority classes form an N-to-1 ratio, then the oversampler should take a float sampling_strategy greater than 1/N (the current minority-to-majority ratio) and at most 1
Afterwards the undersampler is typically given sampling_strategy=0.5

Currently I use float(5*sum(y)/y.size) as the oversampler's sampling_strategy, which grows the minority class to roughly five times its original size.
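To make the arithmetic above concrete, here is a minimal sketch in plain Python of the class counts that an oversample-then-undersample chain would produce. It assumes the binary-case meaning of a float sampling_strategy in imbalanced-learn (the minority-to-majority ratio after resampling) and assumes that counts are truncated to integers. The function names oversampling_ratio and counts_after_mixing are hypothetical helpers for illustration, not part of any library.

```python
def oversampling_ratio(n_minority, n_majority, growth=5):
    """Mirror float(growth * sum(y) / y.size) for 0/1 labels where 1 is the minority."""
    return growth * n_minority / (n_minority + n_majority)


def counts_after_mixing(n_minority, n_majority, over_ratio, under_ratio):
    """Estimate (minority, majority) counts after oversampling, then undersampling.

    Assumption: each ratio means minority / majority after that step,
    and resulting counts are truncated to integers.
    """
    current_ratio = n_minority / n_majority
    # The oversampler can only add minority samples and the undersampler can
    # only remove majority samples, so the ratios must not decrease.
    if not current_ratio <= over_ratio <= under_ratio <= 1.0:
        raise ValueError(
            "expected current_ratio <= over_ratio <= under_ratio <= 1.0, got "
            f"{current_ratio:.6f}, {over_ratio:.6f}, {under_ratio:.6f}"
        )
    # Step 1: grow the minority class until minority / majority == over_ratio.
    n_minority_after = int(over_ratio * n_majority)
    # Step 2: shrink the majority class until minority / majority == under_ratio.
    n_majority_after = int(n_minority_after / under_ratio)
    return n_minority_after, n_majority_after


if __name__ == "__main__":
    n_min, n_maj = 1_000, 1_000_000  # a 1-to-1000 imbalance
    over = oversampling_ratio(n_min, n_maj)
    print(over, counts_after_mixing(n_min, n_maj, over, under_ratio=0.5))
```

Because the expression divides by y.size rather than by the majority count, the minority class ends up slightly below five times its original size; the gap shrinks as the imbalance grows.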

Describe alternatives you've considered

I do not understand why SMOTEENN and SMOTETomek are each provided as a single combined algorithm when there is no clear, generic way to chain an arbitrary oversampler with an arbitrary undersampler.

Additional context

https://machinelearningmastery.com/combine-oversampling-and-undersampling-for-imbalanced-classification/

hayesall · 2022-11-06T16:41:31Z

See #328 and #787

BradKML mentioned this issue Sep 15, 2022

Question: Combining these with Undersampling analyticalmindsltd/smote_variants#59

Closed

hayesall closed this as completed Nov 6, 2022
