
Support for custom resampling #24

Open
tecosaur opened this issue Nov 25, 2022 · 3 comments

@tecosaur

For a problem I'm currently working on, it would be tremendously helpful if I could use a custom resampling method (in my case, a modified stratified bootstrap) to form the training sets used for each "atom" model in the ensemble.

At the moment bagging_fraction is supported, which essentially special-cases the bootstrap sampling approach. Perhaps it would be possible to generalize this to support any ResamplingStrategy?
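
For concreteness, MLJBase already lets you define a custom strategy by subtyping `ResamplingStrategy` and implementing `train_test_pairs`, so the kind of thing I'd like to be able to pass in might look roughly like this (the struct name and fields are just illustrative, not the exact strategy I'm using):

```julia
using MLJBase, Random

# Toy "stratified bootstrap": sample with replacement within each class,
# so every training set preserves the class proportions of `y`.
# (Name and fields are illustrative only.)
struct StratifiedBootstrap <: MLJBase.ResamplingStrategy
    fraction::Float64
    rng::AbstractRNG
end
StratifiedBootstrap(; fraction=1.0, rng=Random.GLOBAL_RNG) =
    StratifiedBootstrap(fraction, rng)

function MLJBase.train_test_pairs(s::StratifiedBootstrap, rows, y)
    train = Int[]
    for c in unique(view(y, rows))
        class_rows = rows[findall(==(c), view(y, rows))]
        n = max(1, round(Int, s.fraction*length(class_rows)))
        append!(train, rand(s.rng, class_rows, n))  # with replacement
    end
    test = setdiff(rows, train)                     # "out-of-bag" rows
    return [(shuffle(s.rng, train), test)]
end
```

The idea would then be for EnsembleModel to call `train_test_pairs` on whatever strategy it is given, rather than hard-coding the holdout behind `bagging_fraction`.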

@ablaom (Member) commented Nov 27, 2022

@tecosaur Thanks for the suggestion. Let me see if I understand it.

Each ResamplingStrategy from MLJBase generates a vector of 2-tuples of the form (train, test). I guess your observation is that the current resampling used in EnsembleModel (bagging without replacement) amounts to generating each atomic sample by taking the first (and only) element (train, test) of the vector returned by MLJBase.train_test_pairs(Holdout(fraction_train=bagging_fraction, rng=rng), rows), and using the indices in train (ignoring test), right? And you are suggesting the ability to do the same with any ResamplingStrategy?
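
Spelled out as code, the equivalence I have in mind is roughly this (a minimal sketch using MLJBase's public `train_test_pairs`; the variable names are illustrative):

```julia
using MLJBase, Random

rng = MersenneTwister(123)
rows = 1:150                 # row indices of the training data
bagging_fraction = 0.8

# One atomic sample, drawn the way I'm describing the current behaviour:
pairs = MLJBase.train_test_pairs(
    Holdout(fraction_train=bagging_fraction, shuffle=true, rng=rng),
    rows,
)
train, test = first(pairs)   # `pairs` is a one-element vector of 2-tuples
# `train` -> rows used to fit one atom; `test` is currently ignored
```

The proposal would then be to allow an arbitrary ResamplingStrategy in place of the `Holdout` above.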

@tecosaur (Author)

That is indeed what I'm proposing. This also ties in somewhat with #25, in that the test indices could optionally be used for out-of-bag predictions, should the user want them.

@ablaom (Member) commented Dec 4, 2022

Makes sense. It seems to me that we could implement this proposal, incorporating the out-of-bag predictions, and I'd support that.

On the other hand, if a more substantial improvement or re-design is being entertained, then we'd want to incorporate those changes concurrently. I'd support that too, but I doubt the core MLJ team has the resources to divert to such a project just now.

@tecosaur Is that something you'd be interested in?
