Commit

minor highlighting fixes
vincecr0ft committed Apr 10, 2018
1 parent 959174e commit 2d73d33
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions docs/source/docs-freq1.rst
@@ -25,13 +25,13 @@ Various improvements and feature requests are also work packages for the statis
Background-only p-values for Searches
-------------------------------------

-In the case of searches, we are interested in calculating a background-only p-value. Typically we start with the statistical model :math:`f(data | \mu, \alpha)`, where :math:`\mu` is proportional to the signal cross-section (e.g. a signal yield or signal strength) and :math:`\alpha` are the nuisance parameters. The appropriate test statistic :math:`\tilde{q}_0` (as defined in Ref.~\cite{Cowan:2010js}), with increasing values indicating more events than expected in the background-only hypothesis, is implemented with `ProfileLikelihoodRatioTestStat`. The background-only p-value, denoted :math:`p_0`, can be calculated using toy Monte Carlo with RooStats `FrequentistCalculator` or by using the asymptotic formulae with the RooStats ``AsymptoticCalculator``.
+In the case of searches, we are interested in calculating a background-only p-value. Typically we start with the statistical model :math:`f(data | \mu, \alpha)`, where :math:`\mu` is proportional to the signal cross-section (e.g. a signal yield or signal strength) and :math:`\alpha` are the nuisance parameters. The appropriate test statistic :math:`\tilde{q}_0` (as defined in Ref.~\cite{Cowan:2010js}), with increasing values indicating more events than expected in the background-only hypothesis, is implemented with ``ProfileLikelihoodRatioTestStat``. The background-only p-value, denoted :math:`p_0`, can be calculated using toy Monte Carlo with RooStats ``FrequentistCalculator`` or by using the asymptotic formulae with the RooStats ``AsymptoticCalculator``.
The subtleties associated to the treatment of global observables and nuisance parameters described above in the upper-limit section also apply here.

Measurements and Confidence Intervals / Parameter Contours
----------------------------------------------------------

-The extended recommendations in Ref.~\cite{ExtendedRecommendations} are aimed at measurement problems (68\% and 95\% confidence intervals in a single parameter or multiple parameters with or without physical boundaries). The document is a natural extension of the search/upper-limit recommendations. The primary difference is to change the test statistic to :math:`t_\mu` (as defined in Ref.~\cite{Cowan:2010js}), which is appropriate for measurements instead of 1-sided tests. This test statistic is also implemented with the RooStats `ProfileLikelihoodRatioTestStat`. As above, the p-values can be calculated either with asymptotic formulae or with toy Monte Carlo; however, there are improvements needed and planned in this case.
+The extended recommendations in Ref.~\cite{ExtendedRecommendations} are aimed at measurement problems (68\% and 95\% confidence intervals in a single parameter or multiple parameters with or without physical boundaries). The document is a natural extension of the search/upper-limit recommendations. The primary difference is to change the test statistic to :math:`t_\mu` (as defined in Ref.~\cite{Cowan:2010js}), which is appropriate for measurements instead of 1-sided tests. This test statistic is also implemented with the RooStats ``ProfileLikelihoodRatioTestStat``. As above, the p-values can be calculated either with asymptotic formulae or with toy Monte Carlo; however, there are improvements needed and planned in this case.


Improvements Needed/Planned
@@ -40,9 +40,9 @@ Improvements Needed/Planned

The ``HypoTestInverter`` currently only supports 1-d problems. The `NeymanConstruction` and `FeldmanCousins` classes support N-D problems (with boundaries), but are not as configurable as the ``HypoTestInverter`` class. The plan in RooStats is to unify these tools. Note that the RooStats ``FrequentistCalculator`` can calculate p-values using toy Monte Carlo and the recommended treatment of global observables and nuisance parameters even in multi-dimensional cases with complex boundaries -- the missing part is not the p-value calculation, but the scan over the parameter space and the actual hypothesis test inversion. Efficient scanning becomes increasingly important for multidimensional problems.

-The `HypoTestInverter` can also be configured to use the `AsymptoticCalculator` to calculate p-values more quickly. The `AsymptoticCalculator` only has the 1-d case with a lower-boundary implemented. The 1-d case with upper- and lower-boundaries has been worked out~\cite{Cowan:2012se} and should be implemented as well.
+The ``HypoTestInverter`` can also be configured to use the ``AsymptoticCalculator`` to calculate p-values more quickly. The ``AsymptoticCalculator`` only has the 1-d case with a lower-boundary implemented. The 1-d case with upper- and lower-boundaries has been worked out~\cite{Cowan:2012se} and should be implemented as well.

-For multi-dimensional problems, p-values based on toy MC can become quite time consuming. In many cases the asymptotic approach is sufficiently accurate and much faster. The presence of boundaries modifies the asymptotic distributions; however, in general this depends on the shape of the boundary, which means there will be no general formulae. It is possible that one can find a formula for the asymptotic distribution for simple boundaries (e.g. :math:`\mu_1>0`, :math:`\mu_2>0`, or :math:`\mu_1>0 \&\& \mu_2>0`). Neglecting these boundary-induced modifications leads to over-coverage and some protection against the sensitivity problem near the boundary, similar to CLs; thus the current recommendation is to use the uncorrected :math:`\chi^2_n` distribution and make this clear in the text of the paper. The tools needed for this non-calibrated asymptotic procedure are therefore already in place with RooStats `ProfileLikelihoodRatioTestStat` and the standard :math:`\chi^2_n` cutoffs for 68\% and 95\% confidence intervals (and have been used in recent Higgs property papers).
+For multi-dimensional problems, p-values based on toy MC can become quite time consuming. In many cases the asymptotic approach is sufficiently accurate and much faster. The presence of boundaries modifies the asymptotic distributions; however, in general this depends on the shape of the boundary, which means there will be no general formulae. It is possible that one can find a formula for the asymptotic distribution for simple boundaries (e.g. :math:`\mu_1>0`, :math:`\mu_2>0`, or :math:`\mu_1>0 \&\& \mu_2>0`). Neglecting these boundary-induced modifications leads to over-coverage and some protection against the sensitivity problem near the boundary, similar to CLs; thus the current recommendation is to use the uncorrected :math:`\chi^2_n` distribution and make this clear in the text of the paper. The tools needed for this non-calibrated asymptotic procedure are therefore already in place with RooStats ``ProfileLikelihoodRatioTestStat`` and the standard :math:`\chi^2_n` cutoffs for 68\% and 95\% confidence intervals (and have been used in recent Higgs property papers).

Diagnostics are important for all statistical methods, particularly for complicated problems.
There are a number of tools that have been developed and are in use by the physics groups,
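The second hunk refers to the "standard" :math:`\chi^2_n` cutoffs for 68\% and 95\% intervals, and the first hunk to the asymptotic calculation of :math:`p_0`. The numbers behind both can be reproduced without ROOT. What follows is a minimal sketch in plain Python, not RooStats code: the function names ``chi2_cutoff`` and ``p0_from_q0`` are made up for illustration, only the closed-form cases n = 1 and n = 2 are handled, and ``p0_from_q0`` assumes the asymptotic relation :math:`p_0 = 1 - \Phi(\sqrt{q_0})` for the discovery test statistic of the cited Cowan et al. reference.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, sigma 1


def chi2_cutoff(confidence_level: float, n_params: int) -> float:
    """Threshold on -2 ln(lambda) for an uncorrected chi^2_n interval.

    Illustrative helper (not part of RooStats); handles n_params = 1 or 2,
    where the chi^2 quantile has a closed form.
    """
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    if n_params == 1:
        # chi^2_1 is the square of a standard normal variable.
        z = _STD_NORMAL.inv_cdf(0.5 * (1.0 + confidence_level))
        return z * z
    if n_params == 2:
        # chi^2_2 is an exponential distribution with mean 2.
        return -2.0 * math.log(1.0 - confidence_level)
    raise NotImplementedError("this sketch only covers n_params = 1 or 2")


def p0_from_q0(q0: float) -> float:
    """Asymptotic background-only p-value, assuming p0 = 1 - Phi(sqrt(q0))."""
    if q0 < 0.0:
        raise ValueError("q0 is non-negative by construction")
    return 1.0 - _STD_NORMAL.cdf(math.sqrt(q0))


if __name__ == "__main__":
    for n in (1, 2):
        for cl in (0.68, 0.95):
            print(f"n={n}  CL={cl:.2f}  cutoff={chi2_cutoff(cl, n):.3f}")
    print(f"q0=25  p0={p0_from_q0(25.0):.3e}")
```

These thresholds apply to the uncorrected :math:`\chi^2_n` distribution only; as the paragraph in the diff notes, boundaries change the true asymptotic distribution, so intervals built this way are expected to over-cover near a boundary.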
