Move best_fit.json into *_result.json #159

tylerbarna · 2023-07-21T19:30:33Z

Feature Summary
Alter the behaviour of the bestfit flag so that it writes the best-fit data points to the result.json file rather than to a separate file.

Usage / behavior
alteration to existing bestfit flag behaviour

Alternative Solutions
If possible, it would be good to "extract" the best fit lightcurve at an earlier point in the fitting rather than calling generate_lightcurve after the fitting has concluded, though I'm unclear as to where this could be accomplished

Implementation details
Make changes to nmma/em/analysis.py

Additional context
Tweaks to PR #147, related to issue #138.


sahiljhawar · 2023-07-21T21:01:35Z

I don't think the bestfit results should go into *_result.json, since it already contains a lot of data. Also, someone writing a script to extract the data for the first time may have to dig through the file. And the file size becomes huge for complex analyses.

tylerbarna · 2023-07-21T21:33:01Z

I don't think the bestfit results should go into *_result.json, since it already contains a lot of data.

I don't think that the bestfit lightcurve and associated times will meaningfully add to the size of the result json file.

Also, someone writing a script to extract the data for the first time may have to dig through the file.

My rationale for adding it to the result json is that the result json file can then function as a single file containing all the information needed for most post-fitting analysis one would conduct.

Placing the best fit lightcurve inside of the result json wouldn't be all that problematic for new users provided we add a blurb in the documentation about accessing it, especially because up to this point users have had to regenerate the lightcurve if they wanted to do any additional analysis. Because of the structure of json files, the code required to extract a best fit lightcurve from a result json vs a separate file will be almost identical.
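To make the "almost identical" point concrete, here is a minimal sketch of both layouts. The file names, the "bestfit" key, the field names, and the "ztfg" filter label are all hypothetical placeholders chosen for illustration; they are not the keys NMMA actually writes.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical best-fit payload; key names are illustrative only.
bestfit = {
    "sample_times": [0.5, 1.0, 1.5],
    "lightcurve": {"ztfg": [19.1, 19.6, 20.2]},
}

with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)

    # Layout A: best fit stored in its own file.
    separate = tmp / "bestfit.json"
    separate.write_text(json.dumps(bestfit))

    # Layout B: best fit nested under a (hypothetical) key of the result file.
    combined = tmp / "example_result.json"
    combined.write_text(json.dumps({"posterior": {}, "bestfit": bestfit}))

    # Reading either layout differs only by one extra key lookup.
    from_separate = json.loads(separate.read_text())
    from_combined = json.loads(combined.read_text())["bestfit"]

print(from_separate == from_combined)
```

Under these assumptions the only difference on the reading side is the extra ["bestfit"] lookup, although both versions still have to parse the whole file they open.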

And the file size becomes huge for complex analyses.

I usually see mine top out at a few megabytes, but I only really touch the EM side of things. Would this have implications for joint analysis?

tylerbarna · 2023-07-21T21:35:22Z

Alternative Solutions
If possible, it would be good to "extract" the best fit lightcurve at an earlier point in the fitting rather than calling generate_lightcurve after the fitting has concluded, though I'm unclear as to where this could be accomplished

I was looking through the code as well as bilby documentation to figure out where/how result.json is created, though I wasn't able to pinpoint it

mcoughlin · 2023-07-21T21:44:06Z

Or just put everything into an hdf5 file?

sahiljhawar · 2023-07-21T22:02:51Z

I usually see mine top out at a few megabytes, but I only really touch the EM side of things. Would this have implications for joint analysis?

Yes, I think so. I had a few runs where the parameter space was 54D and the result.json was as big as 100MB, though I don't think any inference would involve this many parameters unless I add my thing. Key lookup would still be O(1), I guess, when using the data during post-processing. But I think it should be much easier to read the useful metrics if they are directly at the top, instead of searching through a 15+MB JSON file.

sahiljhawar · 2023-07-21T22:17:00Z

I was looking through the code as well as bilby documentation to figure out where/how result.json is created, though I wasn't able to pinpoint it

I think you should check bilby/core/result.py, L735:
def save_to_file(self, filename=None, overwrite=False, outdir=None, extension='json', gzip=False): . . .

sahiljhawar · 2023-07-26T11:48:57Z

Yes, I think so. I had a few runs where the parameter space was 54D and the result.json was as big as 100MB, though I don't think .....

Fetching data from a large 50+MB JSON file is painfully slow.

mcoughlin · 2023-07-26T11:52:26Z

Maybe open an issue with bilby if hdf5 files are not supported. It sounds like something our team should consider helping them implement as the error marginalization you are working on could lead to some large parameter spaces.

sahiljhawar · 2023-07-26T12:21:52Z

Okay. I will look into it.

sahiljhawar · 2023-07-26T15:37:08Z

@mcoughlin bilby supports both hdf5 and pickle.

mcoughlin · 2023-07-26T15:46:40Z

@sahiljhawar @tsunhopang maybe one of you could move us to an hdf5 setup then?

sahiljhawar · 2023-07-26T15:50:13Z

@mcoughlin Trying a run with hdf5 to see the compression.

tylerbarna · 2023-07-26T16:46:53Z

if we do end up moving to an hdf5 setup, I think it would be good to retain an option to save results to a json

sahiljhawar · 2023-07-26T17:13:23Z

Since @tsunhopang already pushed an update with sampler_kwargs, one can pass the following to save results as hdf5: --sampler-kwargs "{'save':'hdf5'}"
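As a rough illustration of how a string like this can be turned into keyword arguments, the sketch below parses it with ast.literal_eval. The helper name parse_sampler_kwargs is hypothetical, and this is an assumption about one reasonable approach, not a description of how NMMA actually handles the flag.

```python
import ast


def parse_sampler_kwargs(text):
    """Parse a dict literal such as "{'save':'hdf5'}" into a Python dict.

    Hypothetical helper for illustration; NMMA's own parsing may differ.
    """
    value = ast.literal_eval(text)  # safely evaluates Python literals only
    if not isinstance(value, dict):
        raise ValueError("sampler kwargs must be a dict literal")
    return value


kwargs = parse_sampler_kwargs("{'save':'hdf5'}")
print(kwargs)
```

Because the string uses single quotes it is a Python literal rather than valid JSON, so json.loads would reject it. That is why the sketch uses ast.literal_eval instead.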

sahiljhawar · 2023-07-26T17:17:15Z

if we do end up moving to an hdf5 setup, I think it would be good to retain an option to save results to a json

We don't need to; it can be left to the user to decide.

sahiljhawar · 2023-07-26T21:06:50Z

hdf5 will not work when using MPI, and the pickle file is too large. Let's stick to the default JSON format. Additionally, add the bestfit parameters to a separate JSON file for easy access, just like in the previous commit. Perhaps consider marking this issue as wontfix, but keep it open.

tylerbarna added the enhancement New feature or request label Jul 21, 2023

tylerbarna added this to the Analysis Tools milestone Jul 21, 2023

tylerbarna self-assigned this Jul 21, 2023

mcoughlin assigned tsunhopang Jul 21, 2023

tylerbarna added the wontfix This will not be worked on label Aug 7, 2023
