rfifind and OpenMP #175

Open
wfarah opened this issue Oct 12, 2022 · 4 comments

wfarah commented Oct 12, 2022

Hi @scottransom

It seems to me that rfifind does not use any parallelism. I ran the same command with and without -ncpus 6 and the runtime did not change, even though rfifind prints "Using 6 threads with OpenMP" when I pass -ncpus 6.

In the source code, I see this:
https://github.com/scottransom/presto/blob/master/src/rfifind.c#L121

But I don't see any OpenMP calls or #pragma directives anywhere else.

Am I missing something?

Thanks!

@scottransom
Owner

Hey Wael,

You are completely correct (unfortunately).

I added the command line options and included the appropriate headers for OpenMP many years ago in anticipation of a big parallel push. But my initial experiments with speed-ups were terrible. I think it is because I need to restructure how the code collects and merges the "birdies" after running search_fft. The current method is hugely serial and a major anti-parallel bottleneck. In general, I would think we should be able to get rfifind to parallelize quite nicely.

Unfortunately I haven't had the time to prioritize this work, though. So if you or a student wanted to tackle it, I think it would make a really nice computational project. A related addition I've been thinking about is adding generalized spectral kurtosis estimates...

Let me know if you have any thoughts as I'd love to move forward on this.

Scott
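One common way to remove a serial collect-and-merge step of the kind described above is to let each thread gather its detections into a private buffer and merge only once per thread. The sketch below is a generic illustration of that pattern, not PRESTO code: the `candidate` struct, the `collect_candidates` function, and the simple threshold test (standing in for whatever search_fft reports) are all hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one detection; NOT PRESTO's birdie type. */
typedef struct {
    int interval;
    int chan;
    float power;
} candidate;

/* Order candidates by (interval, chan) so the merged output is deterministic. */
static int cmp_candidate(const void *a, const void *b)
{
    const candidate *x = a, *y = b;
    if (x->interval != y->interval)
        return (x->interval > y->interval) - (x->interval < y->interval);
    return (x->chan > y->chan) - (x->chan < y->chan);
}

/* Scan an nint x nchan grid of powers (row-major, one row per interval).
 * Each thread appends to its own buffer; buffers are merged once per thread.
 * `out` must have room for nint * nchan entries.
 * Returns the number of candidates, or -1 if an allocation failed. */
long collect_candidates(const float *power, int nint, int nchan,
                        float threshold, candidate *out)
{
    assert(power != NULL && out != NULL);
    long ntotal = (long)nint * nchan;
    size_t nfound = 0;
    int failed = 0;

    #pragma omp parallel
    {
        candidate *local = NULL;   /* thread-private, so no locking needed */
        size_t nlocal = 0, cap = 0;

        #pragma omp for schedule(static) nowait
        for (long i = 0; i < ntotal; i++) {
            if (power[i] <= threshold)
                continue;
            if (nlocal == cap) {   /* grow the private buffer */
                size_t newcap = cap ? 2 * cap : 64;
                candidate *tmp = realloc(local, newcap * sizeof *tmp);
                if (tmp == NULL) {
                    #pragma omp atomic write
                    failed = 1;
                    continue;
                }
                local = tmp;
                cap = newcap;
            }
            local[nlocal].interval = (int)(i / nchan);
            local[nlocal].chan = (int)(i % nchan);
            local[nlocal].power = power[i];
            nlocal++;
        }

        /* The only serialized part: one short copy per thread. */
        #pragma omp critical
        {
            if (nlocal > 0)
                memcpy(out + nfound, local, nlocal * sizeof *local);
            nfound += nlocal;
        }
        free(local);
    }

    if (failed)
        return -1;
    qsort(out, nfound, sizeof *out, cmp_candidate);
    return (long)nfound;
}
```

Built with an OpenMP flag (for example `-fopenmp` with GCC), the loop is split across threads; without it the pragmas are ignored and the same code runs serially, which makes it easy to compare results between the two builds.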

@wfarah
Author

wfarah commented Oct 13, 2022

Hi Scott,

Thanks for the response!

It would be great to have rfifind parallelized. I probably won't have the time for it now, but maybe I can find someone to take a look at this. I will let you know.

In the meantime, the rfifind runtime is quite large. I'm running it on an 8-bit file with 64 µs / 0.5 MHz resolution, 30 min long, 1344 channels, with the arguments:

rfifind -time 30.0 -timesig 10.0 -freqsig 4.0 -chanfrac 0.5 -intfrac 0.3

Is there any command line argument I can add that removes some of the processing to make it run faster? I am searching for FRBs, so probably won't need any FFT-based masking.

-- Wael
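For scale, the figures quoted above imply roughly the following data volume. This is a back-of-the-envelope estimate only: it assumes one byte per sample per channel, ignores header overhead, and ignores any rounding rfifind may apply to the interval length.

```latex
\begin{align*}
N_{\text{samp}} &= \frac{1800\ \text{s}}{64 \times 10^{-6}\ \text{s}} = 2.8125 \times 10^{7} \\
\text{size} &\approx N_{\text{samp}} \times 1344 \times 1\ \text{byte} = 3.78 \times 10^{10}\ \text{bytes} \approx 37.8\ \text{GB} \\
N_{\text{int}} &= \frac{1800\ \text{s}}{30\ \text{s}} = 60, \qquad
N_{\text{int}} \times N_{\text{chan}} = 60 \times 1344 = 80640
\end{align*}
```

So with -time 30.0 there are on the order of 80,000 interval-channel blocks to examine, each covering about 468,750 samples.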

@scottransom
Owner

What I usually do in situations like this is only worry about the bad channels and forget about the time-variable stuff. To do that, I would use just a small chunk of the data (maybe 5-10%?) and run rfifind on it. Then generate an ignorechan list from the rfifind results and pass that to the various prep* commands. If your data is in one long single file, you will have to chop it somehow, but I suspect you can handle that.

The ignorechan stuff is mentioned in the tutorial and also here:
https://github.com/scottransom/presto/blob/master/FAQ.md#what-is-the-difference-between-using--ignorechan-and-explicitly-including-channels-that-you-want-to-zap-in-an-rfifind-mask-using--zapchan-is-one-preferred-over-the-other
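The "generate an ignorechan list" step above can be sketched as follows, assuming you already have a per-channel bad/good flag from the rfifind results. The function name `format_ignorechan` is hypothetical, and the output format (comma-separated channel numbers with inclusive `min:max` ranges and no spaces, e.g. `0:5,34,52:56`) is an assumption that should be checked against the help text of the prep* command you use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Turn per-channel flags (non-zero = bad) into a compact list such as
 * "0:2,4,7:8". Runs of adjacent bad channels become "first:last".
 * Returns the string length, or -1 if buf is too small. */
int format_ignorechan(const unsigned char *bad, int nchan,
                      char *buf, size_t buflen)
{
    assert(bad != NULL && buf != NULL);
    if (buflen == 0)
        return -1;
    buf[0] = '\0';

    size_t used = 0;
    int i = 0;
    while (i < nchan) {
        if (!bad[i]) {
            i++;
            continue;
        }
        /* Extend the run while the next channel is also bad. */
        int start = i;
        while (i + 1 < nchan && bad[i + 1])
            i++;
        int end = i;

        const char *sep = used ? "," : "";
        int n;
        if (start == end)
            n = snprintf(buf + used, buflen - used, "%s%d", sep, start);
        else
            n = snprintf(buf + used, buflen - used, "%s%d:%d", sep, start, end);

        /* snprintf reports the length it wanted; treat truncation as failure. */
        if (n < 0 || (size_t)n >= buflen - used)
            return -1;
        used += (size_t)n;
        i = end + 1;
    }
    return (int)used;
}
```

For flags `{1,1,1,0,1,0,0,1,1,0}` this produces `0:2,4,7:8`, which would then be passed to the prep* commands via their ignorechan option.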

@scottransom
Owner

Just for reference, I should have put a link in for the FAQ entry about the lack of multi-CPU speed-up, in case others come here looking for answers:
https://github.com/scottransom/presto/blob/master/FAQ.md#many-of-these-routines-are-really-slow-and-they-dont-seem-to-get-faster-using-the--ncpus-option--why-is-that
