True Positive and Negative rates in multiclosure weighted fits #2258
base: master
Conversation
The branch was force-pushed from a3aed71 to a569fd6, and later from 93b23f0 to b46f020.
I think it is mostly ok. The only thing is that, if possible, I'd try to trim down the helpers, since some of the functionality is already in results.py (and the functionality that is not would be better suited as enhancements there).
""" | ||
Quantile range for computing the true positive rate and true negative rate. | ||
""" |
""" | |
Quantile range for computing the true positive rate and true negative rate. | |
""" | |
# Quantile range for computing the true positive rate and true negative rate. |
I think Sphinx is able to pick up docstrings for variables, but they need to be below the assignment (see here). I don't think it is needed, but in any case I would either turn them into comments or put them below (for this and all the others).
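A minimal sketch of the two options, assuming a module-level constant; the variable name and value are placeholders, not taken from this PR:

# Option 1: a plain comment, which Sphinx ignores.
# Quantile range for computing the true positive rate and true negative rate.
TPR_TNR_QUANTILE_RANGE = (0.16, 0.84)  # placeholder value

# Option 2: an attribute docstring placed below the assignment,
# which Sphinx autodoc can pick up and render.
TPR_TNR_QUANTILE_RANGE = (0.16, 0.84)  # placeholder value
"""Quantile range for computing the true positive rate and true negative rate."""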
import dataclasses
import pandas as pd
import numpy as np
import logging
Make sure you run pre-commit. To make sure that the hooks are applied to these files you can just add an empty line to them and then run

git add -u
pre-commit

so that pre-commit sees them as modified.
@@ -13,3 +13,6 @@
 from validphys.closuretest.multiclosure_preprocessing import *
 from validphys.closuretest.multiclosure_pseudodata import *
 from validphys.closuretest.inconsistent_closuretest.multiclosure_inconsistent_output import *
+from validphys.closuretest.multiclosure_nsigma_helpers import *
+from validphys.closuretest.multiclosure_nsigma import *
+from validphys.closuretest.multiclosure_nsigma_output import *
Do we really need this? It would be great to only import the functions that are actually needed. We have to live with those that are already there, but right now if I import closuretest and try to autocomplete I get 301 possibilities, some of them are not really useful (like closuretest.np for numpy...)
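As a rough sketch of the explicit alternative (which provider lives in which module is an assumption based on the names appearing in this diff, not the actual file contents):

# Import only the providers that are actually needed, instead of a star import.
from validphys.closuretest.multiclosure_nsigma_helpers import central_predictions
from validphys.closuretest.multiclosure_nsigma import set_2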
@@ -0,0 +1,308 @@
""" |
""" | |
r""" |
Otherwise you need to escape the \S
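A minimal illustration of the point, assuming the docstring contains a LaTeX-style \Sigma (a guess on my part, not taken from the diff):

escaped = "The statistic is n\\Sigma."  # plain string: the backslash must be doubled
raw = r"The statistic is n\Sigma."      # raw string: the backslash is kept as written
assert escaped == raw                   # same text, but the raw form needs no escaping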
def set_2(dataspecs_nsigma_alpha: list) -> dict:
    """
""" | |
r""" |
@property
def reduced(self):
    return self.value / self.ndata
There's already Chi2Data in results... I wonder whether it'd be better to extend that one, or maybe use a different name here? The difference is that this one holds only the dataset, while the other holds a whole result object. (In any case, a docstring is needed.)
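For what it's worth, a hedged sketch of the new dataclass with the requested docstrings added; the field names follow the diff, but everything else is an assumption rather than the actual implementation:

import dataclasses

from validphys.core import DataSetSpec


@dataclasses.dataclass(frozen=True)
class CentralChi2Data:
    """Chi2 of the central PDF member for a single dataset.

    Unlike Chi2Data in results.py, this holds only the dataset spec
    rather than a whole result object.
    """

    value: float
    ndata: int
    dataset: DataSetSpec

    @property
    def reduced(self):
        """Chi2 per data point."""
        return self.value / self.ndata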
def central_predictions(dataset: DataSetSpec, pdf: PDF) -> pd.DataFrame:
    """
    Computes the central prediction (central PDF member) for a dataset.

    Parameters
    ----------
    dataset: validphys.core.DataSetSpec
    pdf: validphys.core.PDF

    Returns
    -------
    pd.DataFrame
        index is datapoints, column is the central prediction.
    """
    return convolution.central_predictions(dataset, pdf)
Suggested change:
- def central_predictions(dataset: DataSetSpec, pdf: PDF) -> pd.DataFrame:
-     """
-     Computes the central prediction (central PDF member) for a dataset.
-
-     Parameters
-     ----------
-     dataset: validphys.core.DataSetSpec
-     pdf: validphys.core.PDF
-
-     Returns
-     -------
-     pd.DataFrame
-         index is datapoints, column is the central prediction.
-     """
-     return convolution.central_predictions(dataset, pdf)
+ from convolution import central_predictions
This would be the same. You can add the type hints there if you need them.
    value = calc_chi2(sqrt_covmat, diff)
    ndata = len(central_predictions)
    return CentralChi2Data(value=value, ndata=ndata, dataset=dataset)
Test it, because I'm not 100% sure (maybe you are doing something in the multiclosure that will break this), but in principle this function could just depend on abs_chi2_data, which would automatically get the predictions and the data and compute the chi2.
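A hedged sketch of that suggestion; the provider name is hypothetical, and the central_result / ndata attribute names are assumed from the Chi2Data tuple in results.py, so they should be double-checked:

def central_chi2_data(abs_chi2_data, dataset):
    """Wrap the already-computed central chi2 of a dataset into a CentralChi2Data."""
    return CentralChi2Data(
        value=abs_chi2_data.central_result,  # chi2 of the central member (assumed attribute)
        ndata=abs_chi2_data.ndata,
        dataset=dataset,
    )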
    Returns
    -------
    str or None
It returns True/False
title: Inconsistency classification probabilities
author: Lazy Person
keywords: [nsigma, chi2, multiclosure test, inconsistent]
|
Perhaps add a comment saying that the commented-out datasets are the ones used to reproduce the paper.
Module for the computation of True Positive and True Negative rates (TPR and TNR) for flagging a dataset as inconsistent.
When the fit is weighted, the TPR and TNR are computed taking into account whether adding the weight has deteriorated the overall fit quality.
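For reference, a minimal sketch (not from the PR) of the standard definitions, where a dataset flagged as inconsistent counts as a positive:

def true_positive_rate(n_true_positive: int, n_false_negative: int) -> float:
    """TPR = TP / (TP + FN): fraction of truly inconsistent datasets that get flagged."""
    return n_true_positive / (n_true_positive + n_false_negative)


def true_negative_rate(n_true_negative: int, n_false_positive: int) -> float:
    """TNR = TN / (TN + FP): fraction of consistent datasets that do not get flagged."""
    return n_true_negative / (n_true_negative + n_false_positive)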