TY - JOUR
T1 - Calculating p-values and their significances with the Energy Test for large datasets
AU - Barter, W.
AU - Burr, C.
AU - Parkes, C.
PY - 2018/4/6
Y1 - 2018/4/6
AB - The energy test method is a multi-dimensional test of whether two samples are consistent with arising from the same underlying population, through the calculation of a single test statistic (called the T-value). The method has recently been used in particle physics to search for samples that differ due to CP violation. The generalised extreme value function has previously been used to describe the distribution of T-values under the null hypothesis that the two samples are drawn from the same underlying population. We show that, in a simple test case, the distribution is not sufficiently well described by the generalised extreme value function. We present a new method, where the distribution of T-values under the null hypothesis when comparing two large samples can be found by scaling the distribution found when comparing small samples drawn from the same population. This method can then be used to quickly calculate the p-values associated with the results of the test.
UR - http://dx.doi.org/10.1088/1748-0221/13/04/p04011
U2 - 10.1088/1748-0221/13/04/p04011
DO - 10.1088/1748-0221/13/04/p04011
M3 - Article
SN - 1748-0221
VL - 13
SP - 1
EP - 8
JO - Journal of Instrumentation
JF - Journal of Instrumentation
M1 - P04011
ER -