Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Odd mean_model() and sps behavior #223

Open
smlower opened this issue Aug 30, 2021 · 9 comments
Open

Odd mean_model() and sps behavior #223

smlower opened this issue Aug 30, 2021 · 9 comments

Comments

@smlower
Copy link

smlower commented Aug 30, 2021

In generating 'best-fit' SEDs for 3 different prospector models, I have discovered odd behavior when doing several mean_model() calls. In short, I am using 3 different attenuation curve models and generating the model SEDs for ~500 galaxies, and when the sps() object is loaded once, the model SEDs generated are much different than if the sps() object is loaded between each model call (even though the 3 models are using the same sps/SFH object). In testing, I've found that I can only recreate this behavior if generating SEDs for ~200-300 galaxies, i.e., I don't see the same behavior for 1 galaxy or a test fsps sp fit. I'll attach some (incomplete) code snippets showing the main jist of how I get the model SEDs:

def build_sps(zcontinuous=1, compute_vega_mags=False, **extras):
    from prospect.sources import FastStepBasis
    sps = FastStepBasis(zcontinuous=zcontinuous,
                       compute_vega_mags=compute_vega_mags)
    return sps

#building sps object the first time
sps = build_sps()


from cal_unif import build_model as cal_model
calm = cal_model()
cal_ratio = []
for i in cal.index:
    #here I take the max. likelihood params for model 1 (calzetti, dust_type=2)
    #from a .pkl file that has all of the individual galaxy fits combined
    #where 'obs' would be called for each galaxy in the .pkl file
    cal_best = cal['theta_max'][i]
    cal_mm = calm.mean_model(cal_best, obs, sps)
    cal_ratio.append(np.log10(obs['maggies'] / cal_mm[1]))

#now generating the SEDs for another model (Kriek & Conroy, dust_type=4)
from kc_unif import build_model as kcu_model
kcum = kcu_model()
sps = build_sps() #if this line is not included, the incorrect SEDs are generated
kc_unif_ratio = []
for i in kc_geo.index:
    kcu_best = kc_unif['theta_max'][i]
    kcu_mm = kcum.mean_model(kcu_best, obs, sps)
    kc_unif_ratio.append(np.log10(obs['maggies'] / kcu_mm[1]))

The plots below show the median SED residuals for each model (the first model above corresponds to the purple line in the plots and the second model the orange). Reloading the sps object the second time produces the second plot whereas loading it the first time only produces the first plot. Any ideas about why this occurs?

image
image

@bd-j
Copy link
Owner

bd-j commented Aug 30, 2021

Oh, that's usettling! My guess is that there's a FSPS model parameter in calm that is being set to a non-default value and is cached by the sps object and/or the fsps.StellarPopulations objects, but that parameter does not exist in the kcm model so the value stays at the last non-default value supplied (which is presumably incorrect for the kcm model). Try print(calm); print(kcm) or print(calm.params); print(kcm.params)

@smlower
Copy link
Author

smlower commented Aug 30, 2021

That could be it....does the fact that dust_type changes between models impact anything (I assumed that loading the different models would keep that from being an issue?). Here's the model params:

In [3]: calm.params                                                                                                   
Out[3]: 
{'lumdist': array([1.e-05]),
 'pmetals': array([-99]),
 'imf_type': array([2]),
 'massmet': array([10. , -0.5]),
 'logmass': array([10.]),
 'logzsol': array([-0.5]),
 'sfh': array([3]),
 'mass': array([1.]),
 'agebins': array([[ 0.        ,  8.        ],
        [ 8.        ,  8.52      ],
        [ 8.52      ,  9.03851565],
        [ 9.03851565,  9.55703131],
        [ 9.55703131, 10.07554696],
        [10.07554696, 10.14612804]]),
 'z_fraction': array([0.83333333, 0.8       , 0.75      , 0.66666667, 0.5       ]),
 'dust_type': array([2]),
 'dust2': array([0.5]),
 'add_dust_emission': array([1]),
 'duste_gamma': array([0.01]),
 'duste_umin': array([1.]),
 'duste_qpah': array([5.86]),
 'add_agb_dust_model': array([0])}
In [5]: kcm.params                                                                                                    
Out[5]: 
{'lumdist': array([1.e-05]),
 'pmetals': array([-99]),
 'imf_type': array([2]),
 'massmet': array([10. , -0.5]),
 'logmass': array([10.]),
 'logzsol': array([-0.5]),
 'sfh': array([3]),
 'mass': array([1.]),
 'agebins': array([[ 0.        ,  8.        ],
        [ 8.        ,  8.52      ],
        [ 8.52      ,  9.03851565],
        [ 9.03851565,  9.55703131],
        [ 9.55703131, 10.07554696],
        [10.07554696, 10.14612804]]),
 'z_fraction': array([0.83333333, 0.8       , 0.75      , 0.66666667, 0.5       ]),
 'dust_type': array([4]),
 'dust2': array([0.1]),
 'dust_index': array([-0.5]),
 'add_dust_emission': array([1]),
 'duste_gamma': array([0.01]),
 'duste_umin': array([1.]),
 'duste_qpah': array([5.86]),
 'add_agb_dust_model': array([0])}

@bd-j
Copy link
Owner

bd-j commented Aug 31, 2021

It looks like the only difference in the names of the parameters being supplied is the extra "dust_index" parameter in kcm. But that shouldn't make a difference since a) it's being called after calm.mean_model and b) dust_index is ignored with dust_type=2 anyway. And the changed value of dust_type should get propagated into sps (you can check by printing doing sps.ssp.params["dust_type"] after each kcm.mean_model() call if you skip the second build_sps()

So this is something more insidious, probably still something with improperly caching or updating parameter state. What versions of prospector and python-fsps are you using?

Also, can you check kc_unif_ratio to see if maybe only the first ratio is problematic? After that you could try replacing the second the build_sps call with sps.ssp.params.dirtiness = 2 and see if that fixes the problem. Beyond that will require detailed checking of the various parameter dictionaries (in kcm, sps, and sps.ssp) before and after each call.

@smlower
Copy link
Author

smlower commented Aug 31, 2021

Prospector is a pretty old version (commit ac89a74) as well as python-fsps (commit 6cb5769). It doesn't appear to just be the first ratio that's problematic, but rather all of them. Setting sps.ssp.params.dirtiness = 2 doesn't appear to do anything.

Looking through sps.ssp.params.iteritems() before and after model calls, it looks like some of the dust parameters don't get reset. For instance, here's the dict when build_sps() is called:
dict_items([('add_agb_dust_model', True), ('add_dust_emission', True), ('add_igm_absorption', False), ('add_neb_emission', False), ('add_neb_continuum', True), ('add_stellar_remnants', True), ('redshift_colors', False), ('compute_light_ages', False), ('nebemlineinspec', True), ('smooth_velocity', True), ('smooth_lsf', False), ('cloudy_dust', False), ('agb_dust', 1.0), ('tpagb_norm_type', 2), ('dell', 0.0), ('delt', 0.0), ('redgb', 1.0), ('agb', 1.0), ('fcstar', 1.0), ('fbhb', 0.0), ('sbss', 0.0), ('pagb', 1.0), ('zred', 0.0), ('zmet', 1), ('logzsol', 0.0), ('pmetals', 2.0), ('imf_type', 2), ('imf_upper_limit', 120), ('imf_lower_limit', 0.08), ('imf1', 1.3), ('imf2', 2.3), ('imf3', 2.3), ('vdmc', 0.08), ('mdave', 0.5), ('evtype', -1), ('masscut', 150.0), ('sigma_smooth', 0.0), ('min_wave_smooth', 1000.0), ('max_wave_smooth', 10000.0), ('gas_logu', -2), ('gas_logz', 0.0), ('igm_factor', 1.0), ('sfh', 0), ('tau', 1.0), ('const', 0.0), ('sf_start', 0.0), ('sf_trunc', 0.0), ('tage', 0.0), ('dust_tesc', 7.0), ('fburst', 0.0), ('tburst', 11.0), ('sf_slope', 0.0), ('dust_type', 0), ('dust1', 0.0), ('dust2', 0.0), ('dust_clumps', -99.0), ('frac_nodust', 0.0), ('frac_obrun', 0.0), ('dust_index', -0.7), ('dust1_index', -1.0), ('mwr', 3.1), ('uvb', 1.0), ('wgp1', 1), ('wgp2', 1), ('wgp3', 1), ('duste_gamma', 0.01), ('duste_umin', 1.0), ('duste_qpah', 3.5), ('fagn', 0.0), ('agn_tau', 10.0)])

And after kcm:
dict_items([('add_agb_dust_model', array([0])), ('add_dust_emission', True), ('add_igm_absorption', False), ('add_neb_emission', False), ('add_neb_continuum', True), ('add_stellar_remnants', True), ('redshift_colors', False), ('compute_light_ages', False), ('nebemlineinspec', True), ('smooth_velocity', True), ('smooth_lsf', False), ('cloudy_dust', False), ('agb_dust', 1.0), ('tpagb_norm_type', 2), ('dell', 0.0), ('delt', 0.0), ('redgb', 1.0), ('agb', 1.0), ('fcstar', 1.0), ('fbhb', 0.0), ('sbss', 0.0), ('pagb', 1.0), ('zred', 0.0), ('zmet', 1), ('logzsol', array([-0.51915035])), ('pmetals', array([-99])), ('imf_type', 2), ('imf_upper_limit', 120), ('imf_lower_limit', 0.08), ('imf1', 1.3), ('imf2', 2.3), ('imf3', 2.3), ('vdmc', 0.08), ('mdave', 0.5), ('evtype', -1), ('masscut', 150.0), ('sigma_smooth', 0.0), ('min_wave_smooth', 1000.0), ('max_wave_smooth', 10000.0), ('gas_logu', -2), ('gas_logz', 0.0), ('igm_factor', 1.0), ('sfh', array([3])), ('tau', 1.0), ('const', 0.0), ('sf_start', 0.0), ('sf_trunc', 0.0), ('tage', 14.000000000000005), ('dust_tesc', 7.0), ('fburst', 0.0), ('tburst', 11.0), ('sf_slope', 0.0), ('dust_type', array([4])), ('dust1', 0.0), ('dust2', array([0.35220475])), ('dust_clumps', -99.0), ('frac_nodust', 0.0), ('frac_obrun', 0.0), ('dust_index', array([-0.13629104])), ('dust1_index', -1.0), ('mwr', 3.1), ('uvb', 1.0), ('wgp1', 1), ('wgp2', 1), ('wgp3', 1), ('duste_gamma', array([0.02170314])), ('duste_umin', array([3.16068474])), ('duste_qpah', array([5.86])), ('fagn', 0.0), ('agn_tau', 10.0)])

And then after calm:
dict_items([('add_agb_dust_model', array([0])), ('add_dust_emission', True), ('add_igm_absorption', False), ('add_neb_emission', False), ('add_neb_continuum', True), ('add_stellar_remnants', True), ('redshift_colors', False), ('compute_light_ages', False), ('nebemlineinspec', True), ('smooth_velocity', True), ('smooth_lsf', False), ('cloudy_dust', False), ('agb_dust', 1.0), ('tpagb_norm_type', 2), ('dell', 0.0), ('delt', 0.0), ('redgb', 1.0), ('agb', 1.0), ('fcstar', 1.0), ('fbhb', 0.0), ('sbss', 0.0), ('pagb', 1.0), ('zred', 0.0), ('zmet', 1), ('logzsol', array([-0.73332245])), ('pmetals', array([-99])), ('imf_type', 2), ('imf_upper_limit', 120), ('imf_lower_limit', 0.08), ('imf1', 1.3), ('imf2', 2.3), ('imf3', 2.3), ('vdmc', 0.08), ('mdave', 0.5), ('evtype', -1), ('masscut', 150.0), ('sigma_smooth', 0.0), ('min_wave_smooth', 1000.0), ('max_wave_smooth', 10000.0), ('gas_logu', -2), ('gas_logz', 0.0), ('igm_factor', 1.0), ('sfh', array([3])), ('tau', 1.0), ('const', 0.0), ('sf_start', 0.0), ('sf_trunc', 0.0), ('tage', 14.000000000000005), ('dust_tesc', 7.0), ('fburst', 0.0), ('tburst', 11.0), ('sf_slope', 0.0), ('dust_type', array([4])), ('dust1', 0.0), ('dust2', array([0.35557541])), ('dust_clumps', -99.0), ('frac_nodust', array([0.94452423])), ('frac_obrun', 0.0), ('dust_index', array([-1.39367833])), ('dust1_index', -1.0), ('mwr', 3.1), ('uvb', 1.0), ('wgp1', 1), ('wgp2', 1), ('wgp3', 1), ('duste_gamma', array([0.00032834])), ('duste_umin', array([0.94721741])), ('duste_qpah', array([5.86])), ('fagn', 0.0), ('agn_tau', 10.0)])

One noticeable thing is that dust_type doesn't change for the ssp even though calm should reset it to dust_type=2. So I think you're right, something is improperly caching some of the ssp parameter states.

@bd-j
Copy link
Owner

bd-j commented Aug 31, 2021

The unchanged dust_type is indeed troubling. I tried to create a minimum example to demonstrate the problem but couldn't reproduce. Can you run the following with your prospector/python-fsps versions and see if any of the asserts fail?

import numpy as np
from sedpy.observate import load_filters

import prospect
from prospect.models.templates import TemplateLibrary
from prospect.utils.obsutils import fix_obs


def build_sps(zcontinuous=1, compute_vega_mags=False, **extras):
    from prospect.sources import FastStepBasis
    sps = FastStepBasis(zcontinuous=zcontinuous,
                        compute_vega_mags=compute_vega_mags)
    return sps


def build_model(dust_type=2):
    from prospect.models import SpecModel, templates
    model_params = templates.TemplateLibrary["continuity_sfh"]
    model_params["dust_type"] = dict(init=dust_type, isfree=False, N=1)
    return SpecModel(model_params)


if __name__ == "__main__":

    # dummy obs dictionary
    filters = load_filters([f"sdss_{b}0" for b in "ugriz"])
    nf = len(filters)
    obs = dict(filters=filters, maggies=np.ones(nf), maggies_unc=np.ones(nf)*0.1,
               spec=None, wavelength=None)
    obs = fix_obs(obs)

    # build the SPS
    sps = build_sps()
    print(sps.ssp.params["dust_type"])

    # predict with dust_type=2
    m2 = build_model(dust_type=2)
    assert m2.params["dust_type"] == 2
    p2 = m2.predict(m2.theta, obs=obs, sps=sps)
    assert m2.params["dust_type"] == 2
    assert sps.ssp.params["dust_type"] == 2

    # predict with dust_type=4
    m4 = build_model(dust_type=4)
    assert m4.params["dust_type"] == 4
    p4 = m4.predict(m4.theta, obs=obs, sps=sps)
    assert m4.params["dust_type"] == 4
    assert sps.ssp.params["dust_type"] == 4

    # check m2 didn't change and the predictions were different
    assert m2.params["dust_type"] == 2
    assert np.all(p4[1] != p2[1])

@smlower
Copy link
Author

smlower commented Aug 31, 2021

Yeah, none of these asserts fail for the simple test. And I actually can't reproduce the above sps.ssp.params.iteritems() issue with just a subsample of galaxies (i.e. the dust_type changes to 2 once calm is loaded)

@bd-j
Copy link
Owner

bd-j commented Sep 1, 2021

Sorry, so the issue still persists when iterating over the entire sample? Does it make sense to add asserts inside the loop and see when exactly they fail?

@smlower
Copy link
Author

smlower commented Sep 3, 2021

Sorry for the delay -- I was banging up against trying to reproduce the first instance of the dust_type issue with sps.ssp.params.iteritems() (I am not sure how it happened the first time and wonder if I copied over the wrong thing?) when I realized what the actual issue is: ('frac_nodust', array([0.94452423])) does not get reset between model calls.

@bd-j
Copy link
Owner

bd-j commented Sep 7, 2021

ok, thanks. I'm not sure there's a good way to guard against this in general, since resetting sps.ssp.params to the defaults at each likelihood call and then propagating the model parameters into it would often lead to time-consuming and unnecessary re-computation of the SSPs. I guess an option would be to reset every parameter that's not in model.params.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants