loading FFTW wisdom from disk is unreliable? #256

Open
mperrin opened this issue May 31, 2018 · 0 comments
mperrin commented May 31, 2018

Noticed by @douglase. When you import poppy, it doesn't seem to load the saved FFTW wisdom as it should.

I tried changing __init__.py to make sure that fftw_load_wisdom is being called, and it is. But the calculation still seems to take longer the first time after reloading.

And, even more confusingly, it then gets slower in subsequent sessions?!? In the transcript below, "Numexpr + FFTW" goes from 0.169 s in the session that saved the wisdom to 0.466 s after restarting. Something is buggy and weird here.

(But we're moving away from FFTW anyway so this may be less important to debug.)
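One way to narrow this down is to time the first transform separately from the later ones, since any planning cost that wisdom is supposed to remove would show up in the first call only. The following is a minimal sketch, not poppy's benchmark code: `time_first_vs_steady` is a hypothetical helper, it uses `numpy.fft.fft2` as a stand-in, and it assumes whichever FFT function is under test can be passed in as a plain callable.

```python
import time

import numpy as np


def time_first_vs_steady(fft_func, npix=256, iterations=5):
    """Time the first call of fft_func separately from later calls.

    Hypothetical helper for illustration; not part of poppy.
    """
    rng = np.random.default_rng(0)
    # Complex128 input, matching the dtype reported by the benchmark.
    data = rng.standard_normal((npix, npix)) + 1j * rng.standard_normal((npix, npix))

    # First call: includes any one-time setup or planning cost.
    t0 = time.perf_counter()
    fft_func(data)
    first = time.perf_counter() - t0

    # Later calls: average cost once setup is done.
    t0 = time.perf_counter()
    for _ in range(iterations):
        fft_func(data)
    steady = (time.perf_counter() - t0) / iterations

    return {"first": first, "steady": steady}


# numpy is only a stand-in here; swap in the FFTW-backed function to test it.
timings = time_first_vs_steady(np.fft.fft2)
print(timings)
```

If the first call stays much slower than the steady-state calls in a fresh session, that would suggest planning is still happening despite the saved wisdom; if both are slow, the problem likely lies elsewhere.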


In [1]: import poppy

In [2]: poppy.accel_math._FFTW_INIT
Out[2]: {}

In [3]: poppy.accel_math.benchmark_fft(npix=4096)
Timing performance of FFT for 4096 x 4096, complex128, with 20 iterations
Timing performance in plain numpy:
  0.170 s
Timing performance with FFTW:
  0.619 s
Timing performance with Numexpr + FFTW:
  0.169 s
Timing performance with OpenCL:
  0.299 s
Out[3]:
{'numpy': 0.17025854700041237,
 'fftw': 0.6192405729001621,
 'numexpr': 0.1686754029011354,
 'cuda': nan,
 'opencl': 0.2994044750012108}

In [4]: poppy.utils.fftw_save_wisdom()

In [5]:
Do you really want to exit ([y]/n)? y


(astroconda) mperrin@phoenix ~  > ipython
Python 3.6.5 |Anaconda, Inc.| (default, Apr 26 2018, 08:42:37)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.3.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import poppy

In [2]: poppy.accel_math.benchmark_fft(npix=4096)
Timing performance of FFT for 4096 x 4096, complex128, with 20 iterations
Timing performance in plain numpy:
  0.168 s
Timing performance with FFTW:
  0.453 s
Timing performance with Numexpr + FFTW:
  0.466 s
Timing performance with OpenCL:
  0.291 s
Out[2]:
{'numpy': 0.16818084940023253,
 'fftw': 0.4526968545498676,
 'numexpr': 0.46563783830060856,
 'cuda': nan,
 'opencl': 0.2912771725503262}
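Another check is whether the wisdom that comes back from disk is actually accepted by FFTW. The sketch below is an assumption-laden illustration, not poppy's `fftw_save_wisdom` / `fftw_load_wisdom`: the helper names and the pickle file format are hypothetical, and the export and import functions are passed in as arguments so the round trip can be exercised without FFTW installed. With pyFFTW available, `pyfftw.export_wisdom` and `pyfftw.import_wisdom` could be passed in; `import_wisdom` reports one success flag per precision.

```python
import pickle
from pathlib import Path


def save_wisdom(path, export_fn):
    """Write whatever export_fn returns to disk and return it.

    Hypothetical helper; the pickle format is an assumption, not poppy's.
    """
    wisdom = export_fn()
    Path(path).write_bytes(pickle.dumps(wisdom))
    return wisdom


def load_wisdom(path, import_fn):
    """Read wisdom from disk, hand it to import_fn, and return both
    the wisdom and import_fn's result so the caller can inspect it."""
    wisdom = pickle.loads(Path(path).read_bytes())
    status = import_fn(wisdom)
    return wisdom, status


# Usage with pyFFTW (assumes pyfftw is installed; not executed here):
#   import pyfftw
#   saved = save_wisdom("wisdom.pkl", pyfftw.export_wisdom)
#   loaded, status = load_wisdom("wisdom.pkl", pyfftw.import_wisdom)
#   print(saved == loaded, status)  # status: flags for double, single, long double
```

If the loaded bytes match what was saved but a status flag is false, the file is intact and the import itself is being rejected, which would point away from the disk I/O.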