Gold Doesn’t Always Glitter: Spectral Removal of Linear and Nonlinear Guarded Attribute Information

Shun Shao, Yftah Ziser, Shay B Cohen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

We describe a simple and effective method (Spectral Attribute removaL; SAL) to remove private or guarded information from neural representations. Our method uses matrix decomposition to project the input representations into directions with reduced covariance with the guarded information rather than maximal covariance as factorization methods normally use. We begin with linear information removal and proceed to generalize our algorithm to the case of nonlinear information removal using kernels. Our experiments demonstrate that our algorithm retains better main task performance after removing the guarded information compared to previous work. In addition, our experiments demonstrate that we need a relatively small amount of guarded attribute data to remove information about these attributes, which lowers the exposure to sensitive data and is more suitable for low-resource scenarios.
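The linear case described in the abstract can be illustrated with a minimal sketch. It assumes the "matrix decomposition" is a singular value decomposition of the empirical cross-covariance between the representations and the guarded attributes, and that the top `k` singular directions are the ones discarded. The function name `sal_linear_projector`, the centering step, and the parameter `k` are illustrative assumptions, not details taken from the abstract, and the kernel (nonlinear) variant is not covered.

```python
import numpy as np


def sal_linear_projector(X, Z, k):
    """Return a (d, d) projector that drops the k directions of X
    with the largest covariance with the guarded attributes Z.

    X: (n, d) array of input representations.
    Z: (n, m) array of guarded attributes (e.g. one-hot labels).
    k: number of high-covariance directions to remove (assumed hyperparameter).
    """
    # Center both views so the product below is a cross-covariance.
    Xc = X - X.mean(axis=0)
    Zc = Z - Z.mean(axis=0)

    # Empirical cross-covariance between representations and attributes, shape (d, m).
    cov = Xc.T @ Zc / X.shape[0]

    # Left singular vectors are ordered from most to least covariance with Z.
    U, _, _ = np.linalg.svd(cov, full_matrices=True)

    # Keep only the directions with reduced covariance, then map back to d dimensions.
    U_keep = U[:, k:]
    return U_keep @ U_keep.T
```

Under these assumptions, cleaned representations would be obtained as `X @ P` with `P = sal_linear_projector(X, Z, k)`. When `k` is at least the rank of the cross-covariance matrix, the projected representations have zero empirical covariance with `Z`, although this says nothing about nonlinear dependence.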
Original language: English
Title of host publication: Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
Publisher: Association for Computational Linguistics
Pages: 1611–1622
Number of pages: 12
Publication status: Published - 2 May 2023
Event: The 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023 - Dubrovnik, Croatia
Duration: 2 May 2023 – 6 May 2023
Conference number: 17
https://2023.eacl.org/

Conference

Conference: The 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023
Abbreviated title: EACL 2023
Country/Territory: Croatia
City: Dubrovnik
Period: 2/05/23 – 6/05/23
