
Unexplainable NFW sampling discrepancy #1023

Open
EiffL opened this issue Sep 6, 2021 · 11 comments
EiffL commented Sep 6, 2021

Hello hello, while writing our own HOD code with @bhorowitz and others we were trying to compare our results to halotools as God's truth, but were finding some residual differences in the 2pt functions that I can't explain.

I have isolated at least one potential thing that could explain our discrepancies, related to the NFW sampling of satellites. As far as I can tell, I get inconsistent results within halotools itself depending on whether I manually sample positions with mc_generate_nfw_radial_positions or sample satellites for the catalog normally.

I made a minimal demo notebook here, and I'm getting this sort of discrepancy in the radial distribution of satellites:
[figure: histograms of the satellite radial distribution from the two sampling paths]

And this amount of difference actually can explain away the differences I'm seeing downstream in the 2pt function.


EiffL commented Sep 6, 2021

If someone could look at the demo notebook and explain to me why the two histograms don't match, I would be sooooo happy.


aphearin commented Sep 7, 2021

we were trying to compare our results to halotools as God's truth

I just checked the Contributor List, but I don't see anybody with this name on it. If I run a quick "git blame" check on the relevant section of source code, it looks like the populate_mock function was just written by some guy ;-)

Based on a quick inspection of your notebook (which is very clear, thanks!), it looks like the populate_mock function might not be properly calling the Monte Carlo generator of radial positions. This is an important bug to fix before the next release, so I'll get to this as soon as I can.


EiffL commented Sep 7, 2021

Thanks so much!


aphearin commented Sep 10, 2021

I'm still not sure, but it looks like this might be related to the use of a lookup table for the concentration. For some values of the manually-overridden concentration I get agreement with the NFW reference, and for others I don't. Here are two examples where the left-hand panel is the same diagnostic plot you wrote, and the right-hand panel is the fractional difference, with the sign convention (mock - reference) / reference.

[figures: diagnostic plots for manually-overridden concentrations of 2.0, 10.0, and 20.0]
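For reference, the fractional-difference diagnostic described above can be sketched in a few lines of NumPy. This is an illustrative stand-alone snippet (function name is mine), not the notebook's actual code:

```python
import numpy as np

def fractional_difference(mock_r, reference_r, bins):
    """Bin two samples of radial distances and return the per-bin
    fractional difference, sign convention (mock - reference) / reference."""
    mock_hist, _ = np.histogram(mock_r, bins=bins, density=True)
    ref_hist, _ = np.histogram(reference_r, bins=bins, density=True)
    return (mock_hist - ref_hist) / ref_hist

# Two samples drawn from the same distribution should give per-bin
# fractional differences consistent with zero up to shot noise.
rng = np.random.default_rng(0)
bins = np.linspace(0.0, 1.0, 21)
frac = fractional_difference(rng.random(1_000_000), rng.random(1_000_000), bins)
```

With matched distributions and large samples, `frac` scatters around zero; a systematic offset like the one in the plots above indicates the two samplers disagree.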

@aphearin

If I change the default spacing in the concentration lookup table from dc=1 to dc=0.25, the discrepancy gets quite a bit smaller, so I think this is the likely culprit.

[figure: diagnostic plot for conc = 10.0 with the finer table spacing]
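The effect of the table spacing is easy to illustrate: a halo whose concentration falls between table entries gets snapped to the nearest entry, and with dc=1 that snap can shift the radial CDF at the percent level. A minimal sketch (this quantization model is my assumption about the mechanism, not halotools's actual lookup-table code):

```python
import numpy as np

def nfw_cdf(r, conc):
    """Cumulative NFW radial profile P(<r), with r in units of R_vir."""
    mu = lambda x: np.log1p(x) - x / (1.0 + x)
    return mu(conc * r) / mu(conc)

def snap_to_table(conc, dc):
    """Round a concentration to the nearest entry of a lookup table
    with spacing dc (an illustration, not halotools's implementation)."""
    return np.round(conc / dc) * dc

r = np.linspace(0.01, 1.0, 100)
true_conc = 10.4  # hypothetical concentration falling between table entries
errs = {dc: np.max(np.abs(nfw_cdf(r, snap_to_table(true_conc, dc))
                          - nfw_cdf(r, true_conc)))
        for dc in (1.0, 0.25)}
```

Shrinking dc from 1.0 to 0.25 cuts the worst-case CDF error by roughly the same factor, consistent with the smaller discrepancy seen above.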

@aphearin

@EiffL here's my branch where I tried the more finely spaced concentrations - https://github.com/astropy/halotools/tree/mockpop_bugfix could you try running your tpcf test to see whether this improves the discrepancy you have been finding?

@aphearin

Actually, you don't need to use a different branch. In the current master branch, you can test this hypothesis by passing in the concentration_bins argument to the PrebuiltHodModelFactory.

@aphearin

[figures: conc = 10.0 diagnostic with fine vs. coarse concentration bins]


EiffL commented Sep 14, 2021

Thanks a lot @aphearin, I tried your fix and it does indeed seem to improve the NFW profile quite a bit. I'm checking what happens at the level of the 2pt functions, and will post plots when I get some convincing results.


EiffL commented Sep 14, 2021

Looking good :-)
[figure: tpcf comparison]
with dc=0.1

vs

[figure: tpcf comparison]
with the default dc (I guess 1.0)

so yeah, looks like this is solving the problem! And now I understand better why my tests at fixed concentration values worked well while I was getting weird results with the concentrations from the catalog.

But looks like changing dc makes the sampling code way slower for some reason

@aphearin

so yeah, looks like this is solving the problem!

Great, thanks for independently confirming that this is the issue. I'm glad this turned out to have a simple resolution.

But looks like changing dc makes the sampling code way slower for some reason

Right, yes, there's a trade-off between the size of the lookup table and the performance. I could try to optimize this a bit and see how it goes, though I think the real solution here is for the NFW profile to be reimplemented to be bin-free using the analytical solution for the CDF inverse. Originally, I had thought that it was a good idea to develop this lookup table machinery since it would be more general, covering cases that had no closed-form analytical solution for the inverse CDF, and thinking that people would want to try out all manner of different profiles. But over time, it became clear that people just wanted to use NFW, and so this feature no longer seems so important.
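For what it's worth, the bin-free approach is straightforward for NFW because the enclosed-mass CDF, P(<r) = mu(c·r/R_vir)/mu(c) with mu(x) = ln(1+x) - x/(1+x), can be inverted in closed form with the Lambert W function. A minimal stand-alone sketch of that idea (function name is mine, not a halotools API):

```python
import numpy as np
from scipy.special import lambertw

def sample_nfw_radii(conc, n, rng=None):
    """Draw scaled radii r/R_vir from an NFW profile of the given
    concentration by analytically inverting the enclosed-mass CDF."""
    rng = np.random.default_rng(rng)
    p = rng.random(n)                       # uniform in enclosed-mass fraction
    mu = lambda x: np.log1p(x) - x / (1.0 + x)
    y = p * mu(conc)
    # Solve mu(x) = y: substituting u = 1/(1+x) gives u - ln(u) = y + 1,
    # whose solution on (0, 1] is u = -W0(-exp(-1 - y)).
    u = -np.real(lambertw(-np.exp(-1.0 - y)))
    x = 1.0 / u - 1.0                       # x = c * r / R_vir
    return x / conc                         # r / R_vir

radii = sample_nfw_radii(conc=10.0, n=100_000, rng=0)
```

Since the inversion is exact for every input concentration, there is no table spacing to tune, at the cost of handling only profiles with a closed-form inverse.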
