Scaling issue with ISJ and custom grid #101

Open
psederberg opened this issue Nov 16, 2021 · 1 comment
Labels
bug Something isn't working help wanted Extra attention is needed

Comments

@psederberg

Hi!

Thanks again for a great library! We've run into an issue where the estimated PDFs are scaled incorrectly for the ISJ method under the following conditions:

  • The data have a narrow standard deviation
  • We provide a wide grid spacing relative to the observed data

This error does not occur with Silverman's bandwidth calculation.

Below I've pasted code for replicating the issue, along with the output. Note how the ISJ approach greatly overestimates the PDF when providing our own evenly-spaced grid points. This does not occur if we narrow the range of the grid points or expand the standard deviation of the random data.

I tried to track down the cause in the source but could not find it. It might have something to do with the real_bw calculation that gives rise to the L factor. I'm therefore unable to provide a PR, only an example that replicates the error, which will hopefully help in figuring out what's happening.

Thanks!

import numpy as np
from KDEpy import FFTKDE
import matplotlib.pyplot as plt

# set up some params for the example
rmin = -2
rmax = 2
xvals = np.linspace(rmin, rmax, 512)
dat = np.random.normal(loc=0, scale=.05, size=1500)

# calculate with ISJ
isj = FFTKDE(kernel='epa', bw="ISJ").fit(np.array(dat))

# evaluate on the custom grid (xvals)
yval_pdf_ISJ = isj.evaluate(xvals)

# evaluate on the default (automatically chosen) grid
gps, yval_pdf_ISJ_e = isj.evaluate()

# calc with silverman
silver = FFTKDE(kernel='epa', bw="silverman").fit(np.array(dat))
yval_pdf_silverman = silver.evaluate(xvals)

# plot the data
plt.hist(dat, density=True, bins='auto', alpha=0.5)

# plot the fits
plt.plot(xvals, yval_pdf_ISJ, alpha=0.8, label='ISJ')
plt.plot(gps, yval_pdf_ISJ_e, alpha=0.8, label='ISJe')
plt.plot(xvals, yval_pdf_silverman, alpha=0.8, label='silverman')

# set the range
plt.xlim(-0.5, .5)

# add legend
plt.legend()
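
One way to put a number on the mis-scaling, rather than judging it from the plot, is to integrate each estimate over its grid: a correctly scaled PDF should integrate to roughly 1. The sketch below is an illustration, not part of the original report. The helper name `grid_integral` is hypothetical, and the check is run here on the exact N(0, 0.05^2) density so that it works without KDEpy installed.

```python
import numpy as np


def grid_integral(x, y):
    """Trapezoidal-rule integral of y sampled on the increasing grid x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


# Sanity check on the exact normal density, using the same grid and the same
# standard deviation as the reproduction script.
xvals = np.linspace(-2, 2, 512)
sigma = 0.05
reference_pdf = np.exp(-0.5 * (xvals / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
reference_area = grid_integral(xvals, reference_pdf)
print(f"area under the reference PDF: {reference_area:.4f}")
```

With the arrays from the script, `grid_integral(xvals, yval_pdf_ISJ)` and `grid_integral(xvals, yval_pdf_silverman)` can be compared directly. An ISJ value well above 1 alongside a silverman value near 1 would quantify the overestimate described here.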

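[image: histogram of the data overlaid with the ISJ, ISJe, and silverman estimates, x-axis limited to -0.5 to 0.5]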
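
Until the cause is found, one possible stopgap is to evaluate on the automatically chosen grid and interpolate the result onto the custom grid. This is only a sketch under assumptions: it has not been confirmed by the maintainer, it presumes that the estimate on the automatic grid is scaled correctly, and it treats the density as zero outside that grid. The helper name `resample_density` is hypothetical, and the `gps` and `y_auto` arrays are stand-ins for the output of `isj.evaluate()` so that the example runs without KDEpy.

```python
import numpy as np


def resample_density(x_new, x_grid, y_grid):
    """Linearly interpolate a density from x_grid onto x_new.

    Points outside x_grid are assigned a density of 0 (an assumption).
    """
    return np.interp(x_new, x_grid, y_grid, left=0.0, right=0.0)


# Stand-in for `gps, y_auto = isj.evaluate()`: a narrow grid around the data
# carrying the exact N(0, 0.05^2) density instead of a real KDE result.
sigma = 0.05
gps = np.linspace(-0.3, 0.3, 1024)
y_auto = np.exp(-0.5 * (gps / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# The wide custom grid from the reproduction script.
xvals = np.linspace(-2, 2, 512)
y_custom = resample_density(xvals, gps, y_auto)

# Trapezoidal-rule area on the custom grid, as a check on the scaling.
area = float(np.sum(0.5 * (y_custom[1:] + y_custom[:-1]) * np.diff(xvals)))
print(f"area after resampling: {area:.4f}")
```

`np.interp` requires the source grid to be increasing, so this relies on the automatic grid being sorted.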

@tommyod
Owner

tommyod commented Nov 17, 2021

Interesting. Thanks for the detailed bug report. I'll look into it when I get the chance.

@tommyod tommyod added the bug Something isn't working label Nov 17, 2021
@tommyod tommyod added the help wanted Extra attention is needed label Jan 15, 2024