Track down markersize bug on David example when downsampling = 1. #812

Open
qh681248 opened this issue Oct 18, 2024 · 0 comments · May be fixed by #821
Labels: bug (Something isn't working), new (Something yet to be discussed by development team)

Comments

@qh681248
Contributor

What's the problem?

I believe that:

  • The example is slow not because the image is too large, but because the chosen leafsize is too close to coreset_size.

  • With the leafsize issue fixed, the algorithm runs much faster, even for downsampling = 1.

  • However, the middle plot is still not generated. The cause is the marker size, s=np.exp(2.0 * coreset_size * herding_weights).reshape(1, -1).

  • Downsampling by a factor of n reduces the number of data points by a factor of n**2. As a result, downsampling = 8 raises an error, because the code then tries to obtain a coreset of size (8000/n) from data of size (36,000/n**2). Downsampling with n = 2 also raises an error, and I don't know why.
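The size mismatch in the last point can be checked with a little arithmetic. The sketch below is illustrative only: the function name and its defaults are hypothetical, and the figures 36,000 and 8000 are simply the ones quoted above, not values read from the example script.

```python
def sizes_after_downsampling(factor, n_points=36_000, base_coreset_size=8_000):
    """Return (points left, coreset size requested) for a downsampling factor.

    Hypothetical helper; the default sizes are the ones quoted in this issue.
    """
    # Downsampling both image axes by `factor` leaves about n_points / factor**2 pixels.
    remaining_points = n_points // factor**2
    # Dividing the coreset size by `factor` only once is the behaviour described above.
    requested_coreset = base_coreset_size // factor
    return remaining_points, requested_coreset


for factor in (1, 2, 8):
    points, requested = sizes_after_downsampling(factor)
    status = "ok" if requested <= points else "coreset larger than data"
    print(f"factor={factor}: points={points}, requested={requested} ({status})")
```

With these numbers the request exceeds the available data at a factor of 8. At a factor of 2 the count check passes, so this check alone does not explain the error seen there.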

Suggested fix:

  • Change the interpolation to area interpolation when resizing.
  • Wherever we currently divide by downsample_scale, divide by downsample_scale ** 2 instead.
  • Set the marker size to 1. This is only a temporary fix; we still need to investigate why the problem exists.
  • Change leafsize from 10,000 to 16,000 or 24,000, which speeds up the code significantly.
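A minimal sketch of the first two suggestions, assuming a 2D grayscale image and an integer downsampling factor. Both function names are hypothetical, and the block average below is a NumPy stand-in for area interpolation rather than the resize call used in the example script.

```python
import numpy as np


def area_downsample(image, factor):
    """Downsample a 2D image by averaging non-overlapping factor x factor blocks."""
    height, width = image.shape
    # Crop so that both dimensions are exact multiples of the factor.
    cropped_height = height - height % factor
    cropped_width = width - width % factor
    cropped = image[:cropped_height, :cropped_width]
    # Split each axis into (number of blocks, block size) and average within blocks.
    blocks = cropped.reshape(
        cropped_height // factor, factor, cropped_width // factor, factor
    )
    return blocks.mean(axis=(1, 3))


def scaled_coreset_size(base_coreset_size, factor):
    """Scale the coreset size by factor**2, matching the drop in pixel count."""
    return max(1, base_coreset_size // factor**2)


image = np.arange(36, dtype=float).reshape(6, 6)
small = area_downsample(image, 2)
print(small.shape, scaled_coreset_size(8_000, 8))
```

Scaling by the square of the factor keeps the ratio of coreset size to data size roughly constant, which is the intent of the second bullet.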

How can we reproduce the issue?

Change leafsize from 10,000 to 16,000 or 24,000 to fix the issue described in the first point, then run the David example with downsample_size = 1 (the default). The middle plot is not generated.
However, if you also set the marker size s to 1, it runs perfectly fine.
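The root cause of the marker size problem has not been identified here. One thing that may be worth checking is how sensitive the expression np.exp(2.0 * coreset_size * herding_weights) is to the magnitude of the weights. The sketch below uses made-up weights (the value 0.1 is purely hypothetical, not taken from the example) to show that the sizes stay small for uniform weights but overflow when a single weight is large.

```python
import numpy as np

coreset_size = 8_000  # coreset size quoted in this issue for downsampling = 1

# Uniform weights that sum to one: every marker gets size exp(2).
uniform_weights = np.full(coreset_size, 1.0 / coreset_size)
uniform_sizes = np.exp(2.0 * coreset_size * uniform_weights)

# Hypothetical skewed weights: one point carries a weight of 0.1.
skewed_weights = uniform_weights.copy()
skewed_weights[0] = 0.1
with np.errstate(over="ignore"):
    # exp(2.0 * 8000 * 0.1) = exp(1600), which exceeds the float64 range.
    skewed_sizes = np.exp(2.0 * coreset_size * skewed_weights)

print(uniform_sizes.max(), np.isinf(skewed_sizes).any())
```

If the actual herding weights turn out to be well behaved, this is not the explanation and the investigation should look elsewhere.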

Python version

3.12

Package version

0.2.1

Operating system

Windows

Other packages

No response

Relevant log output

No response

@qh681248 qh681248 added bug Something isn't working new Something yet to be discussed by development team labels Oct 18, 2024
@qh681248 qh681248 self-assigned this Oct 18, 2024
@qh681248 qh681248 linked a pull request Oct 23, 2024 that will close this issue