Feature/new default xi #800

Open
wants to merge 3 commits into
base: feature/update-dependencies
from


Jacks0nJ
Collaborator

Changed the default value of xi used in sklearn's OPTICS algorithm from 0.05 to 0.15. The value of xi approximately controls the size of the clusters: a smaller xi leads to larger clusters and a larger xi leads to smaller clusters. While 0.05 is the standard value, as recommended in the original OPTICS paper, it can incorrectly include obvious outliers when each cluster contains very few points, as often occurs in Zooniverse projects.
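For reviewers who want to try the effect of xi on a small cluster, here is a minimal sketch that runs sklearn's `OPTICS` with both values. The helper `cluster_with_xi`, the synthetic points, and `min_samples=3` are assumptions made for illustration only; they are not taken from this branch, and the labels will depend on the data and the installed sklearn version.

```python
import numpy as np
from sklearn.cluster import OPTICS


def cluster_with_xi(points, xi, min_samples=3):
    """Fit OPTICS with the xi cluster-extraction method and return the labels.

    Hypothetical helper for illustration; a label of -1 marks a point
    that OPTICS treats as noise (an outlier).
    """
    model = OPTICS(min_samples=min_samples, cluster_method="xi", xi=xi)
    return model.fit_predict(points)


# Synthetic stand-in for a small set of volunteer marks: eight points in a
# tight group plus one point placed well away from them.
rng = np.random.default_rng(0)
tight = rng.normal(loc=(10.0, 10.0), scale=0.5, size=(8, 2))
outlier = np.array([[14.0, 14.0]])
points = np.vstack([tight, outlier])

# Compare the old default (0.05) with the proposed default (0.15).
for xi in (0.05, 0.15):
    labels = cluster_with_xi(points, xi)
    print(f"xi={xi}: labels={labels.tolist()}")
```

Comparing the two printed label lists on real project data, rather than on these synthetic points, is the more meaningful check.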

The value of 0.15 was chosen after tests with real data from the PRINT project showed that obvious outliers (by visual inspection) were identified, while minimising the differences from the results obtained with the previous value of 0.05.

Note that this branch uses the updated OPTICS algorithm, in which a bug in the _predecessor_correction function has been fixed. That bug inadvertently helped remove outliers, including in the unit tests used here, which is how the problem of a too-low xi value for this use case was first identified.

@Jacks0nJ requested a review from CKrawczyk on January 20, 2025, 17:47