TY - JOUR
T1 - Patterns of item nonresponse behaviour to survey questionnaires are systematic and associated with genetic loci
AU - Mignogna, Gianmarco
AU - Carey, Caitlin E
AU - Wedow, Robbee
AU - Baya, Nikolas
AU - Cordioli, Mattia
AU - Pirastu, Nicola
AU - Bellocco, Rino
AU - Malerbi, Kathryn Fiuza
AU - Nivard, Michel G
AU - Neale, Benjamin M
AU - Walters, Raymond K
AU - Ganna, Andrea
N1 - © 2023. The Author(s).
PY - 2023/6/29
Y1 - 2023/6/29
AB - Response to survey questionnaires is vital for social and behavioural research, and most analyses assume full and accurate response by participants. However, nonresponse is common and impedes proper interpretation and generalizability of results. We examined item nonresponse behaviour across 109 questionnaire items in the UK Biobank (N = 360,628). Phenotypic factor scores for two participant-selected nonresponse answers, 'Prefer not to answer' (PNA) and 'I don't know' (IDK), each predicted participant nonresponse in follow-up surveys (incremental pseudo-R² = 0.056), even when controlling for education and self-reported health (incremental pseudo-R² = 0.046). After performing genome-wide association studies of our factors, PNA and IDK were highly genetically correlated with one another (rg = 0.73 (s.e. = 0.03)) and with education (rg,PNA = -0.51 (s.e. = 0.03); rg,IDK = -0.38 (s.e. = 0.02)), health (rg,PNA = 0.51 (s.e. = 0.03); rg,IDK = 0.49 (s.e. = 0.02)) and income (rg,PNA = -0.57 (s.e. = 0.04); rg,IDK = -0.46 (s.e. = 0.02)), with additional unique genetic associations observed for both PNA and IDK (P < 5 × 10⁻⁸). We discuss how these associations may bias studies of traits correlated with item nonresponse and demonstrate how this bias may substantially affect genome-wide association studies. While the UK Biobank data are deidentified, we further protected participant privacy by not examining nonresponse behaviour for single questions, ensuring that no information can be used to associate results with any particular respondent.
U2 - 10.1038/s41562-023-01632-7
DO - 10.1038/s41562-023-01632-7
M3 - Article
C2 - 37386106
SN - 2397-3374
JO - Nature Human Behaviour
JF - Nature Human Behaviour
ER -