Missing band-passes #1

JohannesBuchner · 2023-08-08T14:30:23Z

Hi all,

Thank you for this repository.

I was able to train a SED model with it, but cannot do inference with it yet.

I am confused about this line:
anpe._x_shape = Ut.x_shape_from_simulation(y_tensor)
which causes an "AssertionError: Observed data shape (torch.Size([1, 33])) must match the shape of simulated data x (torch.Size([1, 29]))." for me when doing:

            x = np.concatenate([y_obs, sig_obs])
            ave_theta = hatp_x_y.sample((run_params['np_baseline'],), x=torch.as_tensor(x.astype(np.float32)).to(device), show_progress_bars=False)

I would expect that hatp_x_y._x_shape has to take the shape of x_tensor[0], given that we insert data values and want posteriors with the shape of the parameters, but it is set to the latter?

Secondly, I was wondering about an alternative approach to missing values: randomly set fluxes and errors to some special negative value (e.g. -1), in proportion to their missing fraction, and train on the modified data set; or introduce an additional indicator vector with entries in {0, 1} that marks whether each observation is present. Then one would not need MC later (which requires ordering a 1d data set and assuming that there are no strong emission/absorption lines). Have you tried this approach and dismissed it for some reason? Does it not work? It seems to me that it would need fewer lines of code.
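
A minimal sketch of what such masking could look like as a preprocessing step applied to the simulated photometry before training. The function name `mask_bands`, the sentinel value, and the per-band `missing_frac` array are illustrative assumptions and not part of this repository; the sketch combines the sentinel and indicator variants, although either could be used on its own.

```python
import numpy as np

SENTINEL = -1.0  # hypothetical placeholder value marking a missing band


def mask_bands(fluxes, errors, missing_frac, rng):
    """Randomly drop bands; return [fluxes, errors, presence indicator].

    fluxes, errors : arrays of shape (n_sims, n_bands)
    missing_frac   : per-band probability of being missing, shape (n_bands,)
    rng            : a numpy.random.Generator
    """
    fluxes = np.asarray(fluxes, dtype=np.float32)
    errors = np.asarray(errors, dtype=np.float32)
    missing_frac = np.asarray(missing_frac, dtype=np.float32)

    # True where a band is treated as unobserved in this training example.
    missing = rng.random(fluxes.shape) < missing_frac

    # Replace missing fluxes and errors with the sentinel value.
    masked_flux = np.where(missing, SENTINEL, fluxes).astype(np.float32)
    masked_err = np.where(missing, SENTINEL, errors).astype(np.float32)

    # 1 where the band is present, 0 where it is missing.
    present = (~missing).astype(np.float32)

    return np.concatenate([masked_flux, masked_err, present], axis=-1)
```

The output has width 3 * n_bands, so a network trained on it would need observations prepared in the same layout at inference time.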

Finally, if I understand correctly, this code is a wrapper around
https://www.mackelab.org/sbi/reference/#sbi.inference.snpe.snpe_c.SNPE_C
and adds handling of missing data.
It would be good to encourage users to also cite the original work on SNPE_C listed there and perhaps other foundation papers.
For an example of a suggested list of references, see https://johannesbuchner.github.io/UltraNest/issues.html#how-should-i-cite-ultranest

wangbingjie · 2023-08-15T15:29:11Z

Hi @JohannesBuchner,

Sorry for the late response. The confusion is probably caused by my confusing variable names:

x_tensor contains physical parameters, whereas y_tensor contains fluxes & uncertainties. Hence hatp_x_y._x_shape takes on the shape of y_tensor. This is because, as you pointed out, we would insert data values when calling hatp_x_y.sample.

re: your error message. Could I ask how you trained your model? If the model is trained as

anpe.append_simulations([[physical parameters], [fluxes & uncertainties]])

then it would expect an input with the shape of [fluxes & uncertainties] when drawing posterior samples.
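
As a rough illustration, the shape requirement can be checked before calling sample. The helper `build_observation` and its `expected_dim` argument are hypothetical names, not part of this repository or of sbi; `expected_dim` stands for the width of the [fluxes & uncertainties] array that was passed in during training.

```python
import numpy as np


def build_observation(y_obs, sig_obs, expected_dim):
    """Concatenate fluxes and uncertainties and verify the resulting width.

    expected_dim is assumed to be the number of columns of the
    [fluxes & uncertainties] array used when training the model.
    """
    x = np.concatenate([y_obs, sig_obs]).astype(np.float32)
    if x.shape[-1] != expected_dim:
        raise ValueError(
            f"observation has {x.shape[-1]} values, "
            f"but the model was trained on {expected_dim}"
        )
    # Add a leading batch axis so the shape is (1, expected_dim).
    return x[None, :]
```

A mismatch such as 33 versus 29 would then be reported before any posterior sampling is attempted, which may make it easier to see whether the observed vector and the training vector were built from the same set of bands.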

re: alternative approach for missing values. We didn't pursue this line of thinking mainly because, in many cases, both the number and the locations of missing bands would be random. I agree that this may be a faster solution if these are known a priori; however, I have not tested this.

Please let me know if you have further questions!
