How Data Drive Early Word Learning: A Cross-Linguistic Waiting Time Analysis

Francis Mollica, Steven T. Piantadosi

Research output: Contribution to journal › Article › peer-review


The extent to which word learning is delayed by maturation as opposed to accumulating data is a longstanding question in language acquisition. Further, the precise way in which data influence learning on a large scale is unknown; experimental results reveal that children can rapidly learn words from single instances as well as by aggregating ambiguous information across multiple situations. We analyze Wordbank, a large cross-linguistic dataset of word acquisition norms, using a statistical waiting time model to quantify the role of data in early language learning, building off Hidaka (2013). We find that the model both fits and accurately predicts the shape of children's growth curves. Further analyses of model parameters suggest a primarily data-driven account of early word learning. The parameters of the model directly characterize both the amount of data required and the rate at which informative data occurs. With high statistical certainty, words require on the order of ~10 learning instances, which occur on average once every two months. Our method is extremely simple, statistically principled, and broadly applicable to modeling data-driven learning effects in development.
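
To make the two parameters in the abstract concrete, here is a minimal sketch of a waiting time calculation. It is not the authors' implementation. It assumes informative instances arrive as a Poisson process at a constant rate, so that the time until the k-th instance follows an Erlang (integer-shape gamma) distribution, and it ignores any start-age offset the full model may include. The names `erlang_cdf`, `K`, and `RATE` are hypothetical; the values 10 and 0.5 per month are the rounded figures quoted in the abstract.

```python
import math


def erlang_cdf(t, k, rate):
    """Probability that the k-th event of a Poisson process
    with the given rate has occurred by time t."""
    if t <= 0:
        return 0.0
    lam_t = rate * t
    # Probability that fewer than k events have occurred by time t.
    p_fewer = sum(
        math.exp(-lam_t) * lam_t ** n / math.factorial(n) for n in range(k)
    )
    return 1.0 - p_fewer


K = 10      # assumed number of learning instances a word requires
RATE = 0.5  # assumed informative instances per month (one every two months)

# Expected waiting time in months until the K-th instance.
mean_wait = K / RATE

# Illustrative growth curve: share of learners expected to know the word
# after a given number of months of exposure.
curve = [(months, erlang_cdf(months, K, RATE)) for months in range(0, 37, 6)]

if __name__ == "__main__":
    print(f"mean waiting time: {mean_wait} months")
    for months, p in curve:
        print(f"{months:2d} months: {p:.3f}")
```

Under these assumptions the expected waiting time is k divided by the rate, which for the rounded figures above is 20 months, and the cumulative curve has the S-shape typical of acquisition norms.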
Original language: English
Pages (from-to): 67-77
Number of pages: 11
Journal: Open Mind: Discoveries in Cognitive Science (Open Mind)
Issue number: 2
Publication status: Published - 13 Sep 2017
