Spectral Unsupervised Parsing with Additive Tree Metrics

Ankur P. Parikh, Shay B. Cohen, Eric P. Xing

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

We propose a spectral approach for unsupervised constituent parsing that comes with theoretical guarantees on latent structure recovery. Our approach is grammarless – we directly learn the bracketing structure of a given sentence without using a grammar model. The main algorithm is based on lifting the concept of additive tree metrics for structure learning of latent trees in the phylogenetic and machine learning communities to the case where the tree structure varies across examples. Although finding the “minimal” latent tree is NP-hard in general, for the case of projective trees we find that it can be found using bilexical parsing algorithms. Empirically, our algorithm performs favorably compared to the constituent context model of Klein and Manning (2002) without the need for careful initialization.
Original languageEnglish
Title of host publicationProceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Place of PublicationBaltimore, Maryland
PublisherAssociation for Computational Linguistics
Pages1062-1072
Number of pages11
Publication statusPublished - 1 Jun 2014

Fingerprint

Dive into the research topics of 'Spectral Unsupervised Parsing with Additive Tree Metrics'. Together they form a unique fingerprint.

Cite this