Using multi-rater and test-retest data to detect overlap within and between psychological scales

Sam Henry*, Dustin Wood, David M. Condon, Graham H. Lowman, René Mõttus

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract / Description of output

Correlations estimated in single-source data provide uninterpretable estimates of empirical overlap between scales. We describe a model to adjust correlations for errors and biases using test–retest and multi-rater data, and compare the adjusted correlations among individual items with their human-rated semantic similarity (SS). We expected adjusted correlations to predict SS better than unadjusted correlations and to exceed SS in absolute magnitude. While unadjusted and adjusted correlations predicted SS rankings equally well across all items, adjusted correlations were superior for items judged most redundant in meaning. Retest- and agreement-adjusted correlations were usually higher than SS, whereas unadjusted correlations often underestimated SS. We discuss uses of test–retest and multi-rater data for identifying construct redundancy and argue that SS often underestimates variables' empirical overlap.
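To illustrate the general idea of adjusting an observed correlation for unreliability, the sketch below uses the classic Spearman correction for attenuation, with test–retest or cross-rater agreement coefficients standing in as reliability estimates. This is not the authors' specific model (which jointly uses test–retest and multi-rater data); the function name and numeric values are hypothetical.

    import numpy as np

    def disattenuate(r_xy, rel_x, rel_y):
        """Classic Spearman correction for attenuation.

        r_xy  : observed correlation between two items
        rel_x : reliability estimate for item x (e.g., a retest
                correlation or cross-rater agreement coefficient)
        rel_y : reliability estimate for item y

        Returns the correlation adjusted for unreliability, capped
        at +/-1 so the estimate stays interpretable.
        """
        adjusted = r_xy / np.sqrt(rel_x * rel_y)
        return float(np.clip(adjusted, -1.0, 1.0))

    # Hypothetical example: an observed item correlation of .35 with
    # reliability estimates of .60 and .55 rises to roughly .61.
    print(disattenuate(0.35, 0.60, 0.55))

Because item-level reliabilities are typically well below 1, adjusted correlations can substantially exceed unadjusted ones, which is consistent with the abstract's point that raw single-source correlations often understate empirical overlap.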
Original language: English
Article number: 104530
Journal: Journal of Research in Personality
Volume: 113
Early online date: 1 Sept 2024
DOIs
Publication status: Published - Dec 2024

Keywords

  • cross-rater agreement
  • item-level analysis
  • jingle-jangle
  • reliability
  • semantic similarity
