True Positive and Negative rates in multiclosure weighted fits #2258

comane · 2025-01-17T15:14:33Z

Module for computation of True Positive and Negative rates for flagging a dataset as inconsistent.

When the fit is weighted, TPR and TNR are computed taking into account whether adding the weight has deteriorated the overall fit quality.

Add arXiv reference once paper is out
notation of various sets should be consistent with the one used in the paper (currently S2 <-> S3)
polish plotting functions


scarlehoff

I think it is mostly ok. The only thing is that, if possible, I'd try to trim down the helpers, since some of the functionality is already in results.py (and whatever is not would be better suited as an enhancement there).

scarlehoff · 2025-03-26T15:37:26Z

validphys2/src/validphys/closuretest/multiclosure_nsigma.py

+"""
+Quantile range for computing the true positive rate and true negative rate.
+"""


Suggested change

-"""
-Quantile range for computing the true positive rate and true negative rate.
-"""
+# Quantile range for computing the true positive rate and true negative rate.

I think Sphinx is able to pick up docstrings for variables, but they need to be below the variable (see here). I don't think it is needed, but in any case I would either make them into comments or put them below (for this and all the others).
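A minimal sketch of the two documented forms, assuming hypothetical constant names and values (none of them are taken from the PR): Sphinx autodoc reads a string literal placed directly after the assignment, or a "#:" comment placed before it. A plain "#" comment is ignored by autodoc, which is fine when the variable does not need to appear in the rendered docs.

```python
# Hypothetical names and values, for illustration only.

# Form 1: attribute docstring, placed directly BELOW the assignment.
QUANTILE_RANGE = (0.05, 0.95)
"""Quantile range for computing the true positive rate and true negative rate."""

# Form 2: a "#:" doc comment placed ABOVE the assignment.
#: Number of points used to scan the quantile range.
N_QUANTILE_POINTS = 20
```

A string placed above the assignment, as in the original diff, is just an expression statement attached to nothing.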

scarlehoff · 2025-03-26T17:56:04Z

validphys2/src/validphys/closuretest/multiclosure_nsigma_helpers.py

+import dataclasses
+import pandas as pd
+import numpy as np
+import logging


Make sure you run pre-commit. To have the hooks applied to these files, you can add an empty line to each of them and then run

git add -u
pre-commit

so that pre-commit sees them as modified.

scarlehoff · 2025-03-26T17:58:21Z

validphys2/src/validphys/closuretest/__init__.py

@@ -13,3 +13,6 @@
 from validphys.closuretest.multiclosure_preprocessing import *
 from validphys.closuretest.multiclosure_pseudodata import *
 from validphys.closuretest.inconsistent_closuretest.multiclosure_inconsistent_output import *
+from validphys.closuretest.multiclosure_nsigma_helpers import *
+from validphys.closuretest.multiclosure_nsigma import *
+from validphys.closuretest.multiclosure_nsigma_output import *


Do we really need this? It would be great to only import the functions that are actually needed. We have to live with those that are already there, but right now if I import closuretest and try to autocomplete I get 301 possibilities, and some of them are not really useful (like closuretest.np for numpy...).
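One way to limit what a star import exposes is to define __all__ in each module. The sketch below illustrates the mechanism with a hypothetical in-memory module called demo_helpers; it is not the layout of the actual validphys files.

```python
import sys
import types

# Source of a hypothetical helpers module (names are illustrative only).
source = '''
import math  # implementation detail that should not leak to importers

__all__ = ["reduced_chi2"]

def reduced_chi2(value, ndata):
    """Chi2 per data point."""
    return value / ndata

def rms(values):
    """Helper deliberately left out of __all__."""
    return math.sqrt(sum(v * v for v in values) / len(values))
'''

# Register the module so that it can be imported by name.
module = types.ModuleType("demo_helpers")
exec(source, module.__dict__)
sys.modules["demo_helpers"] = module

# Emulate what a package __init__ does with a star import.
namespace = {}
exec("from demo_helpers import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)  # ['reduced_chi2']
```

Without __all__, the same star import would also pull in math and rms, which is how names such as closuretest.np end up in the autocomplete list.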

scarlehoff · 2025-03-26T17:59:24Z

validphys2/src/validphys/closuretest/multiclosure_nsigma.py

@@ -0,0 +1,308 @@
+"""


Suggested change

-"""
+r"""

Otherwise you need to escape the \S
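A small sketch of why the r prefix matters, using a hypothetical docstring text that contains \S: compiling the plain version raises an invalid-escape warning (DeprecationWarning or SyntaxWarning, depending on the Python version), while the raw version compiles cleanly.

```python
import warnings


def compile_warnings(source: str) -> list:
    """Compile ``source`` and return the categories of any warnings emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<docstring-demo>", "exec")
    return [w.category for w in caught]


# The compiled source contains a single backslash followed by S.
plain = 'def f():\n    """Datasets in the set \\S_2."""\n'
raw = 'def f():\n    r"""Datasets in the set \\S_2."""\n'

plain_warnings = compile_warnings(plain)  # invalid escape sequence '\S'
raw_warnings = compile_warnings(raw)  # no warning with the r prefix
```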

scarlehoff · 2025-03-26T17:59:55Z

validphys2/src/validphys/closuretest/multiclosure_nsigma.py

+
+
+def set_2(dataspecs_nsigma_alpha: list) -> dict:
+    """


Suggested change

-"""
+r"""

scarlehoff · 2025-03-26T18:09:12Z

validphys2/src/validphys/closuretest/multiclosure_nsigma_helpers.py

+
+    @property
+    def reduced(self):
+        return self.value / self.ndata


There's already Chi2Data in results... I wonder whether it'd be better to extend that one? Or maybe use a different name here?

The difference is that this one holds the dataset only while the other holds a whole result object.

(in any case docstr needed)
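As a rough sketch of what a documented version could look like: the field names value, ndata and dataset and the class name CentralChi2Data are taken from the diff, while the docstring wording and the loose typing of dataset are assumptions made to keep the example self-contained.

```python
import dataclasses


@dataclasses.dataclass
class CentralChi2Data:
    """Chi2 of the central prediction for a single dataset.

    Unlike a container holding a whole result object, this one only keeps
    a reference to the dataset it was computed for.

    Attributes
    ----------
    value: float
        The (unnormalised) chi2.
    ndata: int
        Number of data points in the dataset.
    dataset: object
        The dataset specification (a DataSetSpec in validphys; typed
        loosely here so the sketch runs without validphys installed).
    """

    value: float
    ndata: int
    dataset: object

    @property
    def reduced(self) -> float:
        """Chi2 per data point."""
        return self.value / self.ndata


# Usage with a placeholder dataset label.
chi2 = CentralChi2Data(value=12.0, ndata=8, dataset="HYPOTHETICAL_DATASET")
print(chi2.reduced)  # 1.5
```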

scarlehoff · 2025-03-26T18:22:30Z

validphys2/src/validphys/closuretest/multiclosure_nsigma_helpers.py

+def central_predictions(dataset: DataSetSpec, pdf: PDF) -> pd.DataFrame:
+    """
+    Computes the central prediction (central PDF member) for a dataset.
+
+    Parameters
+    ----------
+    dataset: validphys.core.DataSetSpec
+    pdf: validphys.core.PDF
+
+    Returns
+    -------
+    pd.DataFrame
+        index is datapoints, column is the central prediction.
+    """
+    return convolution.central_predictions(dataset, pdf)


Suggested change

-def central_predictions(dataset: DataSetSpec, pdf: PDF) -> pd.DataFrame:
-    """
-    Computes the central prediction (central PDF member) for a dataset.
-
-    Parameters
-    ----------
-    dataset: validphys.core.DataSetSpec
-    pdf: validphys.core.PDF
-
-    Returns
-    -------
-    pd.DataFrame
-        index is datapoints, column is the central prediction.
-    """
-    return convolution.central_predictions(dataset, pdf)

from convolution import central_predictions

would be the same. You can add the type hints there if you need them.

scarlehoff · 2025-03-26T18:25:10Z

validphys2/src/validphys/closuretest/multiclosure_nsigma_helpers.py

+
+    value = calc_chi2(sqrt_covmat, diff)
+    ndata = len(central_predictions)
+    return CentralChi2Data(value=value, ndata=ndata, dataset=dataset)


Test it because I'm not 100% sure (maybe you are doing something in the multiclosure that will break this), but in principle this function could just depend on abs_chi2_data, which would automatically get the predictions and the data and compute the chi2.

scarlehoff · 2025-03-26T18:26:06Z

validphys2/src/validphys/closuretest/multiclosure_nsigma_helpers.py

+
+    Returns
+    -------
+    str or None


It returns True/False

scarlehoff · 2025-03-26T18:27:59Z

validphys2/examples/true_pos_neg_closuretests.yaml

+  title: Inconsistency classification probabilities
+  author: Lazy Person
+  keywords: [nsigma, chi2, multiclosure test, inconsistent]
+


Perhaps add a comment saying that the commented-out datasets are the ones needed to reproduce the paper.

comane changed the title ~~[WIP] True Positive and Negative rates in multiclosure weighted fits~~ True Positive and Negative rates in multiclosure weighted fits Jan 18, 2025

comane requested review from scarlehoff and RoyStegeman January 18, 2025 19:06

comane force-pushed the nsigma_multiclosure_tpr_tnr branch from a3aed71 to a569fd6 Compare January 23, 2025 11:39

comane added the closure tests label Feb 9, 2025

comane added 22 commits March 26, 2025 15:12

added code for computation of TPR and TNR in a multiclosure test

bcbd7c5

added docs and TODO comments

ea56508

changed structure of tpr and tnr dataframe

16faf88

added Generator typing and use plotutils wrapper of matplotlib

aa4abcc

use api to get dataset label

b969ad0

added all mutliclosure nsigma modules

31973e9

added multiclosure nsigma modules to __init__ file

838b511

added docstrings to functions and moved some functions to helpers

c06fa51

added plotting function of prob inconsist consist

bf0be1c

added function for computation of probability of consistent/inconsistent dataset

cc6b78d

added docs to module

d178508

moved helper functions to helpers module

cac64dc

change plots with z_alpha as x-label

d0a7109

changed labels as in paper draft

edc4f7f

inverted labels

63e5cd5

changed labels and added different linestyle

7e61f62

removed unused module

b0a3ac3

added runcard to examples

cbdedc5

import from validphys rather than using sys

32bfe2a

use consistent notation with 2503.17447 paper

24891be

use consistent notation with 2503.17447 paper

060907f

removed hardcoded title

b46f020

comane force-pushed the nsigma_multiclosure_tpr_tnr branch from 93b23f0 to b46f020 Compare March 26, 2025 15:12

import CommonData from nnpdf_data

6d31329

scarlehoff reviewed Mar 26, 2025
