Acquiring Compact Lexicalized Grammars from a Cleaner Treebank

Julia Hockenmaier, Mark Steedman

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution

Abstract / Description of output

We present an algorithm which translates the Penn Treebank into a corpus of Combinatory Categorial Grammar (CCG) derivations. To do this, we have needed to make several systematic changes to the Treebank, which have the effect of cleaning up a number of errors and inconsistencies. This process has yielded a cleaner treebank that can potentially be used in any framework. We also show how unary type-changing rules for certain types of modifiers can be introduced in a CCG grammar to ensure a compact lexicon without augmenting the generative power of the system. We demonstrate how the combination of preprocessing and type-changing rules minimizes the lexical coverage problem.
Original language: English
Title of host publication: Proceedings of the Third International Conference on Language Resources and Evaluation, LREC 2002, May 29-31, 2002, Las Palmas, Canary Islands, Spain
Number of pages: 8
Publication status: Published - 2002

