We present a cognitive model of early lexical acquisition which jointly performs word segmentation and learns an explicit model of phonetic variation. We define the model as a Bayesian noisy channel; we sample segmentations and word forms simultaneously from the posterior, using beam sampling to control the size of the search space. Compared to a pipelined approach in which segmentation is performed first, our model is qualitatively more similar to human learners. On data with variable pronunciations, the pipelined approach learns to treat syllables or morphemes as words. In contrast, our joint model, like infant learners, tends to learn multiword collocations. We also conduct analyses of the phonetic variations that the model learns to accept and its patterns of word recognition errors, and relate these to developmental evidence.
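The noisy-channel idea in the abstract can be illustrated with a minimal sketch. This is not the paper's model: it performs no segmentation and no beam sampling, and it only scores which intended word form best explains one observed pronunciation. The lexicon, its prior probabilities, the substitution-only channel, and the names `LEXICON`, `P_SAME`, `channel_log_prob`, and `decode` are all hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical toy lexicon with prior probabilities P(word); not from the paper.
LEXICON = {"want": 0.5, "what": 0.3, "wand": 0.2}

# Hypothetical channel parameter: probability that a character surfaces unchanged.
P_SAME = 0.9
ALPHABET_SIZE = 26


def channel_log_prob(intended: str, surface: str) -> float:
    """Return log P(surface | intended) under a substitution-only channel.

    Assumption: the channel can only replace characters, never insert or
    delete them, so forms of different length get probability zero.
    """
    if len(intended) != len(surface):
        return float("-inf")
    # Probability mass for a changed character is shared among the other letters.
    p_diff = (1.0 - P_SAME) / (ALPHABET_SIZE - 1)
    return sum(
        math.log(P_SAME if a == b else p_diff)
        for a, b in zip(intended, surface)
    )


def decode(surface: str) -> str:
    """Pick the lexicon entry maximising log P(word) + log P(surface | word)."""
    scores = {
        word: math.log(prior) + channel_log_prob(word, surface)
        for word, prior in LEXICON.items()
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    for observed in ["want", "wanp", "whad"]:
        print(observed, "->", decode(observed))
```

In this toy setting the prior and the channel trade off against each other: a surface form with one changed character is still mapped to the nearest likely word. A joint model of the kind described above would additionally have to choose word boundaries and learn the channel parameters rather than fix them.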
|Title of host publication||Proceedings of the Conference on Empirical Methods in Natural Language Processing|
|Publisher||Association for Computational Linguistics|
|Number of pages||13|
|Publication status||Published - 2013|
|Title||A Joint Learning Model of Word Segmentation, Lexical Acquisition, and Phonetic Variability|
Project: Word segmentation from noisy data with minimal supervision. (24/01/11 → 23/04/14, finished)