Trackdown markersize bug on David example when downsampling = 1. #812

qh681248 · 2024-10-18T16:54:19Z

What's the problem?

I believe that:

The long run time is not caused by the image being too large; it is caused by the chosen leafsize being too close to coreset_size.
If you fix the leafsize issue, the algorithm runs much faster, even for downsampling = 1.
However, the middle plot is still not generated. The cause is the marker size, s=np.exp(2.0 * coreset_size * herding_weights).reshape(1, -1).
If you downsample by a factor of n, the number of data points is reduced by a factor of n^2. This means that if you choose a downsampling factor of 8, you get an error, because at that point you are trying to obtain a coreset of size 8,000/n from original data of size 36,000/n**2, which is more points than the data contains. (You also get an error when you choose n = 2; I don't know why.)
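
To make the arithmetic in the last point concrete, here is a minimal sketch. It assumes the figures quoted above (about 36,000 original data points and a base coreset size of 8,000); the function name and the `exponent` parameter are hypothetical and not taken from the example script.

```python
def sizes_after_downsampling(n, exponent=1, original_points=36_000, base_coreset_size=8_000):
    """Return (data_points, coreset_size) for a downsampling factor n.

    The data shrinks by n**2 because both image axes are reduced by n.
    `exponent` is the power of n used to shrink the coreset size:
    1 mirrors the current behaviour described above, 2 mirrors the suggested fix.
    """
    data_points = original_points // (n ** 2)
    coreset_size = base_coreset_size // (n ** exponent)
    return data_points, coreset_size


for n in (1, 2, 4, 8):
    data_points, coreset_size = sizes_after_downsampling(n)
    # A coreset cannot contain more points than the data it is drawn from.
    status = "ok" if coreset_size <= data_points else "coreset larger than data"
    print(f"n={n}: data={data_points}, coreset={coreset_size} ({status})")
```

Under these assumptions only n = 8 asks for more coreset points than exist, so this arithmetic alone does not explain the error at n = 2. Passing `exponent=2` keeps the coreset-to-data ratio constant for every n.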

Suggested fix:

Change the interpolation (use area interpolation when resizing).
Wherever we were dividing by downsample_scale, divide by downsample_scale ** 2 instead.
Replace the marker size with 1 (this is only a temporary fix; we still need to investigate why this problem exists).
Change leafsize from 10,000 to 16,000 or 24,000 (this speeds up the code significantly).
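
As a starting point for the marker-size investigation, one thing worth checking is whether the exponential produces non-finite values. This is only a hypothesis, not a confirmed diagnosis: `np.exp` overflows to `inf` in float64 once its argument exceeds roughly 709, which `2.0 * coreset_size * herding_weights` can reach if any weight is far from uniform. The helper names below are hypothetical; only the marker-size expression comes from the example.

```python
import numpy as np


def marker_sizes(herding_weights, coreset_size):
    """Marker-size expression quoted in this issue."""
    weights = np.asarray(herding_weights, dtype=float)
    return np.exp(2.0 * coreset_size * weights).reshape(1, -1)


def safe_marker_sizes(herding_weights, coreset_size, fallback=1.0):
    """Fall back to a constant marker size if any computed size is inf or NaN.

    Hypothetical guard: it implements the temporary "marker size = 1" fix only
    when the computed sizes cannot be plotted.
    """
    with np.errstate(over="ignore"):  # silence the overflow warning from np.exp
        sizes = marker_sizes(herding_weights, coreset_size)
    if not np.all(np.isfinite(sizes)):
        return np.full_like(sizes, fallback)
    return sizes


# Uniform weights keep the exponent at 2.0, so the sizes stay finite.
uniform = safe_marker_sizes([1 / 8000] * 3, coreset_size=8000)

# A single large weight pushes the exponent to 8000, which overflows.
skewed = safe_marker_sizes([0.5, 0.5], coreset_size=8000)
```

If the sizes turn out to be finite on the real data, the cause lies elsewhere and this guard would not help.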

How can we reproduce the issue?

Change leafsize from 10,000 to 16,000 or 24,000 (to fix the issue described in the first point above) and run the David example with downsample_size = 1 (the default). The middle plot is not generated.
However, if you then replace the marker size s with 1, it runs perfectly fine.

Python version

3.12

Package version

0.2.1

Operating system

Windows

Other packages

No response

Relevant log output

No response

qh681248 added the bug (Something isn't working) and new (Something yet to be discussed by development team) labels Oct 18, 2024

qh681248 self-assigned this Oct 18, 2024

qh681248 linked a pull request Oct 23, 2024 that will close this issue

:fix: correct plot scaling calculation and leaf parameters #821

Merged

9 tasks

rg936672 mentioned this issue Nov 5, 2024

:fix: correct plot scaling calculation and leaf parameters #821

Merged

9 tasks

rg936672 closed this as completed in #821 Nov 7, 2024
