
Improved inference for the signal significance

I. Volobouev and A. Trindade

Published 6 December 2018 © 2018 IOP Publishing Ltd and Sissa Medialab
Citation: I. Volobouev and A. Trindade 2018 JINST 13 P12011, DOI 10.1088/1748-0221/13/12/P12011


Abstract

We study the properties of several likelihood-based statistics commonly used in testing for the presence of a known signal under a mixture model with known background but unknown signal fraction. Under the null hypothesis of no signal, all statistics follow a standard normal distribution in large samples, but substantial deviations can occur at practically relevant sample sizes. Approximations for the respective p-values are derived to various orders of accuracy using the methodology of Edgeworth expansions. Adherence to normality is studied, and the magnitude of deviations is quantified according to the resulting p-value inflation or deflation. We find that approximations to third-order accuracy are generally sufficient to guarantee p-values with nominal false positive error rates in the 5σ range (p-value = 2.87 × 10⁻⁷) for the classic Wald, score, and likelihood ratio (LR) statistics at relatively low sample sizes. Not only does LR have better adherence to normality, but it also consistently outperforms all other statistics in terms of false negative error rates. The reasons for this are shown to be connected with high-order cumulant behavior gleaned from fourth-order Edgeworth expansions. Finally, a conservative procedure is suggested for making finite sample adjustments while accounting for the look-elsewhere effect via the theory of random fields.
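The 5σ figure quoted in the abstract is the one-sided upper-tail probability of a standard normal distribution, which is the large-sample reference against which the finite-sample corrections are measured. The conversion can be sketched as below. This is only an illustration of the normal-tail arithmetic, not the Edgeworth-corrected p-values derived in the paper, and the function names `p_value_from_sigma` and `sigma_from_p_value` are hypothetical helpers introduced here.

```python
import math
from statistics import NormalDist


def p_value_from_sigma(z: float) -> float:
    """One-sided upper-tail probability of a standard normal beyond z.

    Uses erfc rather than 1 - cdf to avoid cancellation in the far tail.
    """
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def sigma_from_p_value(p: float) -> float:
    """Inverse conversion: the z whose one-sided upper-tail probability is p."""
    # By symmetry, the upper-tail quantile is minus the lower-tail quantile.
    return -NormalDist().inv_cdf(p)


if __name__ == "__main__":
    p5 = p_value_from_sigma(5.0)
    # Approximately 2.87e-07, the value quoted in the abstract.
    print(f"p-value at 5 sigma: {p5:.3e}")
    print(f"round trip back to sigma: {sigma_from_p_value(p5):.6f}")
```

Any statistic whose null distribution deviates from standard normal at the available sample size would yield a true tail probability different from this nominal value, which is the p-value inflation or deflation the abstract refers to.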

