Domain Adaptation by Constraining Inter-Domain Variability of Latent Feature Representation

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We consider a semi-supervised setting for domain adaptation where only unlabeled data is available for the target domain. One way to tackle this problem is to train a generative model with latent variables on the mixture of data from the source and target domains. Such a model would cluster features in both domains and ensure that at least some of the latent variables are predictive of the label on the source domain. The danger is that these predictive clusters will consist of features specific to the source domain only and, consequently, a classifier relying on such clusters would perform badly on the target domain. We introduce a constraint enforcing that marginal distributions of each cluster (i.e., each latent variable) do not vary significantly across domains. We show that this constraint is effective on the sentiment classification task (Pang et al., 2002), resulting in scores similar to the ones obtained by the structural correspondence methods (Blitzer et al., 2007) without the need to engineer auxiliary tasks.
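The abstract only sketches the idea, so here is a loose illustration of what such a constraint can look like, not the paper's actual model or objective: a penalty on the gap between the per-domain marginal activation of each latent variable. The sigmoid encoder, the parameters W and b, and the squared-distance penalty below are all hypothetical stand-ins chosen for a minimal runnable sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(X, W, b):
    """Posterior activation of binary latent features, q(z=1 | x).
    Illustrative sigmoid encoder, not the paper's generative model."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def domain_variability_penalty(X_src, X_tgt, W, b):
    """Penalize differences between the marginal distribution of each
    latent variable on the source vs. target domain. Here the gap is
    measured as a squared distance between per-domain mean activations;
    the paper's divergence term may differ."""
    q_src = encode(X_src, W, b).mean(axis=0)  # marginal q(z_k = 1) on source
    q_tgt = encode(X_tgt, W, b).mean(axis=0)  # marginal q(z_k = 1) on target
    return np.sum((q_src - q_tgt) ** 2)

# Toy usage: random stand-ins for 50-dimensional feature vectors from
# two domains, with 10 latent variables.
X_src = rng.random((32, 50))
X_tgt = rng.random((32, 50))
W = rng.normal(scale=0.1, size=(50, 10))
b = np.zeros(10)
print(domain_variability_penalty(X_src, X_tgt, W, b))
```

In a full training objective, a term like this would be added to the likelihood of the mixed source-and-target data, discouraging latent clusters that fire only in one domain.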
Original language: English
Title of host publication: The 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 19-24 June, 2011, Portland, Oregon, USA
Publisher: Association for Computational Linguistics
Pages: 62-71
Number of pages: 10
ISBN (Print): 978-1-932432-87-9
Publication status: Published - 2011
