New Semi Supervised Learning Algorithms? #205

hansen7 · 2019-05-15T23:45:52Z

Description

Hi, are there plans to add metric learning algorithms in the semi-supervised direction, i.e. ones that use both labels/pairwise constraints and unlabelled data to derive the distance metric?

Some References

Locally linear metric adaptation for semi-supervised clustering
Metric Learning from Relative Comparisons by Minimizing Squared Residual
Semi-Supervised Metric Learning Using Pairwise Constraints

wdevazelhes · 2019-05-22T13:08:12Z

Hi @hansen7, thanks for those references.

Metric Learning from Relative Comparisons by Minimizing Squared Residual is LSML, which is already present in metric-learn, but I didn't see the paper mention how to use unlabeled data (I didn't read it thoroughly, though).
I did have a quick look at Locally linear metric adaptation for semi-supervised clustering, and it does seem to be able to use unlabeled data as well as pairwise constraints.
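To make the point about LSML concrete, here is a minimal NumPy sketch of its data term as I understand it: a squared residual on each violated relative comparison "d(a, b) should be smaller than d(c, d)". It is not metric-learn's implementation; the function name `lsml_data_term` is hypothetical, and the regularizer and per-constraint weights are left out. Only points that appear in a quadruplet enter the loss, so unlabeled points have no effect.

```python
import numpy as np

def lsml_data_term(M, X, quadruplets):
    """Sum of squared residuals over violated constraints d(a, b) < d(c, d).

    Hypothetical helper for illustration; regularizer and weights omitted.
    """
    a, b, c, d = (X[quadruplets[:, k]] for k in range(4))

    def dist(u, v):
        diff = u - v
        # Mahalanobis distance sqrt(diff^T M diff), one value per row
        return np.sqrt(np.einsum("ij,jk,ik->i", diff, M, diff))

    residual = dist(a, b) - dist(c, d)
    # Satisfied constraints (residual <= 0) contribute nothing
    return float(np.sum(np.maximum(residual, 0.0) ** 2))

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]])
quads = np.array([[0, 1, 0, 2],   # d(0,1)=1 < d(0,2)=3: satisfied
                  [0, 2, 0, 1]])  # d(0,2)=3 < d(0,1)=1: violated
loss = lsml_data_term(np.eye(2), X, quads)
```

Point 3 is in no quadruplet, so moving it leaves `loss` unchanged, which is the sense in which this objective does not use unlabeled data.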

So yes, I think it would be cool to have these kinds of algorithms. At some point we will need to decide which algorithms are a priority for metric-learn, so it's useful to have these in mind already.

Any thoughts, @bellet @perimosocordiae @terrytangyuan @nvauquie?

perimosocordiae · 2019-05-22T13:36:20Z

Note that we also have gh-13 tracking other requested algorithms. Let's keep that list updated as new algorithms are proposed/implemented.

I'm in favor of adding more algorithm diversity to the package, in general. I think our standards can be looser than scikit-learn's or scipy's, but we should also be pragmatic and not take on too much. Criteria might include:

A publication with a reasonable number of citations.
A reference implementation or published inputs/outputs that we can validate our version against.
An implementation that doesn't require thousands of lines of new code, or adding new mandatory dependencies.

Of course, any of these three guidelines could be ignored in special cases.

terrytangyuan · 2019-05-22T20:51:53Z

Agree with what @perimosocordiae said above. Just adding my two cents here that we should prioritize the algorithms that have:

Larger number of citations
Common parts that can be reused by other/existing algorithms
Better proven performance over other similar/existing algorithms

bellet · 2019-05-24T17:44:17Z

I am adding the following paper which I think is the most classic semi-supervised metric learning algorithm (using graph regularization through the Laplacian):
https://dl.acm.org/citation.cfm?id=1823752
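For readers unfamiliar with graph regularization, below is a rough NumPy sketch of a generic Laplacian smoothness term. It is not necessarily the exact objective of the linked paper; the helper names `knn_affinity` and `laplacian_regularizer`, the 0/1 k-NN graph, and the linear map `A` are all assumptions made for illustration. The graph is built over all points, labeled or not, which is how the unlabeled data enters.

```python
import numpy as np

def knn_affinity(X, k):
    """Symmetric 0/1 k-nearest-neighbour affinity matrix over all rows of X."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(sq, np.inf)           # a point is not its own neighbour
    idx = np.argsort(sq, axis=1)[:, :k]    # k closest points per row
    W = np.zeros_like(sq)
    W[np.repeat(np.arange(len(X)), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)              # symmetrise

def laplacian_regularizer(A, X, W):
    """tr(Z^T Lap Z) with Z = X A^T and Lap = D - W.

    For symmetric W this equals 0.5 * sum_ij W_ij * ||z_i - z_j||^2.
    """
    lap = np.diag(W.sum(axis=1)) - W
    Z = X @ A.T
    return float(np.trace(Z.T @ lap @ Z))

rng = np.random.default_rng(0)
X_all = rng.normal(size=(12, 3))   # labeled and unlabeled points together
A = rng.normal(size=(2, 3))        # candidate linear map (metric M = A^T A)
W = knn_affinity(X_all, k=3)
reg = laplacian_regularizer(A, X_all, W)
```

A semi-supervised method would add a term like `reg` to a supervised loss, so that neighbouring points in the input space stay close under the learned metric.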

It would be nice to have such an algorithm in the package at some point. But inputting unlabeled points may require some thinking in terms of API, and the use-cases are perhaps a bit limited.
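One possible convention for passing unlabeled points, sketched below, is the one scikit-learn's semi-supervised estimators use: mark unlabeled samples with `-1` in `y`. This is only an illustration of the API question, not something metric-learn does; `split_labeled_unlabeled` is a hypothetical helper.

```python
import numpy as np

UNLABELED = -1  # marker borrowed from scikit-learn's semi-supervised estimators

def split_labeled_unlabeled(X, y):
    """Split (X, y) into labeled points, their labels, and unlabeled points.

    Hypothetical helper: an estimator's fit(X, y) could call this internally.
    """
    X = np.asarray(X)
    y = np.asarray(y)
    if X.shape[0] != y.shape[0]:
        raise ValueError("X and y must have the same number of rows")
    labeled = y != UNLABELED
    return X[labeled], y[labeled], X[~labeled]

X = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [2.0, 2.0]])
y = np.array([0, 1, UNLABELED, UNLABELED])
X_lab, y_lab, X_unlab = split_labeled_unlabeled(X, y)
```

This covers class labels only. With pairwise constraints there is no `y` per point, so the unlabeled points would probably need to be passed some other way, which is part of why the API needs thought.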

@hansen7 do you have a personal interest in implementing such semi-supervised methods, or are you simply looking for ideas on what could be included in metric-learn? If it is the latter, gh-13 is indeed a good place to look. In my opinion, adding the classic and effective triplet-based approach of https://www.cs.cornell.edu/people/tj/publications/schultz_joachims_03a.pdf would be awesome.
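As a rough illustration of the triplet-based idea, the sketch below learns non-negative per-feature weights so that, for each triplet (i, j, k), point i ends up closer to j than to k by a margin. It is a simplification, not the paper's method: it assumes a purely diagonal metric, and it swaps the paper's solver for projected subgradient descent on a hinge loss. The function name and all hyperparameters are hypothetical.

```python
import numpy as np

def fit_diagonal_triplet_metric(X, triplets, C=1.0, lr=0.01, n_iter=500):
    """Learn weights w >= 0 so that d_w(i, j)^2 + 1 <= d_w(i, k)^2 on triplets.

    Minimises 0.5 * ||w||^2 + C * sum of hinge losses by projected
    subgradient descent (an assumption; not the original solver).
    """
    i, j, k = triplets.T
    # delta[t] @ w == d_w(i, k)^2 - d_w(i, j)^2 for triplet t
    delta = (X[i] - X[k]) ** 2 - (X[i] - X[j]) ** 2
    w = np.ones(X.shape[1])
    for _ in range(n_iter):
        active = delta @ w < 1.0                  # margin-violating triplets
        grad = w - C * delta[active].sum(axis=0)
        w = np.maximum(w - lr * grad, 0.0)        # project onto w >= 0
    return w

# Feature 0 carries the similarity signal; feature 1 is a distractor.
X = np.array([[0.0, 0.0], [0.1, 5.0], [3.0, 0.2], [0.2, -4.0], [-3.0, 0.1]])
triplets = np.array([[0, 1, 2],   # 0 should be closer to 1 than to 2
                     [0, 3, 4]])  # 0 should be closer to 3 than to 4
w = fit_diagonal_triplet_metric(X, triplets)

def weighted_sq_dist(a, b):
    return float(np.sum(w * (X[a] - X[b]) ** 2))
```

Under the plain Euclidean distance both triplets are violated, because the distractor feature dominates. After fitting, the weight on feature 1 should be pushed down and both comparisons should hold under `weighted_sq_dist`.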

hansen7 · 2019-06-05T17:03:19Z

> @hansen7 do you have a personal interest in implementing such semi-supervised methods, or are you simply looking for ideas on what could be included in metric-learn? If it is the latter, gh-13 is indeed a good place to look. In my opinion, adding the classic and effective triplet-based approach of https://www.cs.cornell.edu/people/tj/publications/schultz_joachims_03a.pdf would be awesome.

Thanks. I have actually implemented a few semi-supervised algorithms, such as SERAPH from here, for my research projects, and I would be very happy to help develop these methods within metric-learn.

perimosocordiae added the new feature label May 22, 2019

mvargas33 mentioned this issue Oct 21, 2021

[DOC] [WIP] Developers documentation page #340

Open
