Encoding Prior Knowledge with Eigenword Embeddings

Dominique Osborne, Shashi Narayan, Shay B. Cohen

Research output: Contribution to journal › Article › peer-review

Abstract

Canonical correlation analysis (CCA) is a method for reducing the dimension of data represented using two views. It has been previously used to derive word embeddings, where one view indicates a word, and the other view indicates its context. We describe a way to incorporate prior knowledge into CCA, give a theoretical justification for it, and test it by deriving word embeddings and evaluating them on a myriad of datasets.
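
As a rough illustration of the two-view setup the abstract describes, the sketch below computes plain CCA projections with NumPy. It does not reproduce the paper's method for incorporating prior knowledge. The function name `cca_projections`, the small ridge term `reg`, and the choice not to center the data are assumptions made for this example only. `X` stands in for the word view and `Y` for the context view, one row per token.

```python
import numpy as np


def cca_projections(X, Y, k, reg=1e-8):
    """Sketch of plain (uncentered) CCA between two views.

    X: (n, d1) array, e.g. the word view (assumed layout).
    Y: (n, d2) array, e.g. the context view (assumed layout).
    Returns projection matrices A (d1, k), B (d2, k) and the
    top-k singular values of the whitened cross-covariance.
    """
    n = X.shape[0]
    # Covariance estimates; the ridge term keeps them invertible (assumption).
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n

    def inv_sqrt(C):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # SVD of the whitened cross-covariance yields the canonical directions.
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy, full_matrices=False)
    A = Wx @ U[:, :k]
    B = Wy @ Vt[:k].T
    return A, B, s[:k]
```

Under these assumptions, `X @ A` gives a k-dimensional representation of each row of the first view; when `X` is a one-hot word indicator matrix, the rows of `A` can be read as k-dimensional word embeddings.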
Original language: English
Pages (from-to): 417-430
Number of pages: 14
Journal: Transactions of the Association for Computational Linguistics
Volume: 4
Publication status: Published - 1 Jul 2016
