TY - JOUR
T1 - Bootstrapping word boundaries: A bottom-up corpus-based approach to speech segmentation
AU - Cairns, Paul
AU - Shillcock, Richard
AU - Chater, Nick
AU - Levy, Joe
PY - 1997
Y1 - 1997/07//
N2 - Speech is continuous, and isolating meaningful chunks for lexical access is a nontrivial problem. In this paper we use neural network models and more conventional statistics to study the use of sequential phonological probabilities in the segmentation of an idealized phonological transcription of the London–Lund Corpus; these speech data are representative of genuine conversational English. We demonstrate, first, that the distribution of phonetic segments in English is an important cue to segmentation, and, second, that the distributional information is such that it might allow the infant, beginning with only a sensitivity to the statistics of subsegmental primitives, to bootstrap into a series of increasingly sophisticated segmentation competences, ending with an adult competence. We discuss the relation between the behavior of the models and existing psycholinguistic studies of speech segmentation. In particular, we confirm the utility of the Metrical Segmentation Strategy (Cutler & Norris, 1988) and demonstrate a route by which this utility might be recognized by the infant, without requiring the prior specification of categories like “syllable” or “strong syllable.”
AB - Speech is continuous, and isolating meaningful chunks for lexical access is a nontrivial problem. In this paper we use neural network models and more conventional statistics to study the use of sequential phonological probabilities in the segmentation of an idealized phonological transcription of the London–Lund Corpus; these speech data are representative of genuine conversational English. We demonstrate, first, that the distribution of phonetic segments in English is an important cue to segmentation, and, second, that the distributional information is such that it might allow the infant, beginning with only a sensitivity to the statistics of subsegmental primitives, to bootstrap into a series of increasingly sophisticated segmentation competences, ending with an adult competence. We discuss the relation between the behavior of the models and existing psycholinguistic studies of speech segmentation. In particular, we confirm the utility of the Metrical Segmentation Strategy (Cutler & Norris, 1988) and demonstrate a route by which this utility might be recognized by the infant, without requiring the prior specification of categories like “syllable” or “strong syllable.”
U2 - 10.1006/cogp.1997.0649
DO - 10.1006/cogp.1997.0649
M3 - Article
VL - 33
SP - 111
EP - 153
JO - Cognitive Psychology
JF - Cognitive Psychology
SN - 0010-0285
IS - 2
ER -