It is common practice in the statistical analysis of phonetic data to draw conclusions on the basis of statistical significance. While p-values reflect the probability of incorrectly concluding a null effect is real, they do not provide information about other types of error that are also important for interpreting statistical results. In this paper, we focus on three measures related to these errors. The first, power, reflects the likelihood of detecting an effect that in fact exists. The second and third, Type M and Type S errors, measure the extent to which estimates of the magnitude and direction of an effect are inaccurate. We then provide an example of design analysis (Gelman & Carlin, 2014), using data from an experimental study on German incomplete neutralization, to illustrate how power, magnitude, and sign errors vary with sample size and effect size. This case study shows how the informativity of research findings can vary substantially in ways that are not always, or even usually, apparent on the basis of a p-value alone. We conclude by repeating three recommendations for good statistical practice in phonetics, drawn from best practices widely advocated in the social and behavioral sciences: report all results; design studies that will produce high-precision estimates; and conduct direct replications of previous findings.
- effect size
- design analysis
- incomplete neutralization
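
The three quantities named in the abstract (power, Type S error, and Type M error) can be illustrated with a minimal sketch in the spirit of Gelman & Carlin (2014). This is not the mixed-effects, simulation-based analysis the paper itself carries out: it assumes a single effect estimate that is normally distributed around a hypothesized true effect with a known standard error. The function name `design_analysis` and its parameters are hypothetical, chosen only for illustration.

```python
import random
from statistics import NormalDist


def design_analysis(true_effect, std_error, alpha=0.05, n_sims=20000, seed=1):
    """Normal-approximation design analysis (illustrative sketch only).

    Assumes the estimate is drawn from Normal(true_effect, std_error) and
    that a two-sided z-test at level `alpha` decides significance.
    """
    if true_effect <= 0 or std_error <= 0:
        raise ValueError("true_effect and std_error must be positive")

    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = true_effect / std_error

    # Probability of a significant estimate with the correct (positive) sign.
    p_correct_sign = 1 - std_normal.cdf(z_crit - shift)
    # Probability of a significant estimate with the wrong (negative) sign.
    p_wrong_sign = std_normal.cdf(-z_crit - shift)

    power = p_correct_sign + p_wrong_sign
    # Type S: share of significant results that have the wrong sign.
    type_s = p_wrong_sign / power

    # Type M (exaggeration ratio): expected |estimate| among significant
    # results, divided by the true effect. Estimated here by simulation.
    rng = random.Random(seed)
    significant = []
    for _ in range(n_sims):
        estimate = rng.gauss(true_effect, std_error)
        if abs(estimate) > z_crit * std_error:
            significant.append(abs(estimate))
    if significant:
        type_m = (sum(significant) / len(significant)) / true_effect
    else:
        type_m = float("nan")

    return {"power": power, "type_s": type_s, "type_m": type_m}


if __name__ == "__main__":
    # Hypothetical inputs: same standard error, large vs. small true effect.
    for effect in (2.8, 0.5):
        print(effect, design_analysis(true_effect=effect, std_error=1.0))
```

Under these assumptions, shrinking the true effect relative to the standard error lowers power while raising both the Type S rate and the exaggeration ratio, which is the pattern the abstract describes as not apparent from a p-value alone.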