This paper presents an incremental probabilistic learner that models the acquisition of syntax and semantics from a corpus of child-directed utterances paired with possible representations of their meanings. These meaning representations approximate the contextual input available to the child; they do not specify the meanings of individual words or syntactic derivations. The learner must therefore infer the meanings and syntactic properties of the words in the input along with a parsing model. We use the CCG grammatical framework and train a non-parametric Bayesian model of parse structure with online variational Bayesian expectation maximization. When tested on utterances from the CHILDES corpus, our learner outperforms a state-of-the-art semantic parser. In addition, it models aspects of child acquisition such as "fast mapping", while also countering previous criticisms of statistical syntactic learners.
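To make the incremental training regime concrete, the following is a minimal sketch of a stepwise (online) EM-style update over expected grammar-rule counts, in the spirit of the online variational Bayesian EM mentioned above. The class name, the stepsize schedule, and the toy rule names are illustrative assumptions, not the paper's actual implementation.

```python
# Hedged sketch of an online EM-style learner: after parsing each
# utterance, expected rule counts (the E-step output) are folded into
# running sufficient statistics with a decaying stepsize. All names
# here are hypothetical; the paper's model is non-parametric and more
# elaborate than this toy version.
from collections import defaultdict


class OnlineRuleLearner:
    def __init__(self, alpha=1.0, tau=1.0, kappa=0.6):
        self.alpha = alpha  # Dirichlet-style pseudo-count (prior)
        self.tau = tau      # stepsize offset
        self.kappa = kappa  # stepsize decay exponent, typically in (0.5, 1]
        self.counts = defaultdict(float)  # running expected rule counts
        self.t = 0          # number of utterances processed so far

    def prob(self, rule, support):
        # Smoothed estimate of a rule's probability among the
        # competing rules in `support` (e.g. rules sharing a parent).
        total = sum(self.counts[r] for r in support)
        return (self.counts[rule] + self.alpha) / (total + self.alpha * len(support))

    def update(self, expected_counts):
        # expected_counts: {rule: expected count} from one utterance's
        # parse forest. Interpolate old statistics toward the new ones.
        self.t += 1
        eta = (self.t + self.tau) ** (-self.kappa)  # decaying stepsize
        for rule in self.counts.keys() | expected_counts.keys():
            target = expected_counts.get(rule, 0.0)
            self.counts[rule] = (1 - eta) * self.counts[rule] + eta * target
```

Because each utterance triggers only a small interpolation of the statistics, a single informative example can shift a word's distribution noticeably, which is one way such models capture "fast mapping" effects.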
Title of host publication: Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics
Place of publication: Avignon, France
Publisher: Association for Computational Linguistics
Number of pages: 11
Publication status: Published - 1 Apr 2012