Piecewise training for structured prediction

Charles Sutton, Andrew McCallum

Research output: Contribution to journal › Article › peer-review

Abstract

A drawback of structured prediction methods is that parameter estimation requires repeated inference, which is intractable for general structures. In this paper, we present an approximate training algorithm called piecewise training (PW) that divides the factors into tractable subgraphs, which we call pieces, that are trained independently. Piecewise training can be interpreted as approximating the exact likelihood using belief propagation, and different ways of making this interpretation yield different insights into the method. We also present an extension to piecewise training, called piecewise pseudolikelihood (PWPL), designed for the case in which variables have large cardinality. On several real-world natural language processing tasks, piecewise training performs better than Besag's pseudolikelihood and sometimes comparably to exact maximum likelihood. In addition, PWPL performs similarly to PW and better than standard pseudolikelihood, but is five to ten times more computationally efficient than batch maximum likelihood training.
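
As a rough sketch of the idea described above (the notation here is ours and may differ from the paper's exact formulation), piecewise training replaces the model's global partition function with a product of per-piece local normalizers, so each piece can be trained without inference on the full graph; PWPL additionally applies a pseudolikelihood decomposition within each piece, so each normalizer ranges over a single variable's domain:

```latex
% Sketch of the piecewise and PWPL objectives (notation ours).
% Pieces are indexed by a, with local parameters \theta_a and
% features f_a over the variables y_a of that piece.
\ell_{\mathrm{PW}}(\theta)
  = \sum_{a} \Bigl[ \theta_a^{\top} f_a(y_a, x)
      - \log \sum_{y'_a} \exp\bigl(\theta_a^{\top} f_a(y'_a, x)\bigr) \Bigr]

% PWPL further conditions each variable s of a piece on the
% observed values of the piece's remaining variables, so each
% local sum ranges over one variable's domain only:
\ell_{\mathrm{PWPL}}(\theta)
  = \sum_{a} \sum_{s \in a}
      \log p_a\bigl(y_s \mid y_{a \setminus s}, x;\, \theta_a\bigr)
```

Because each local normalizer sums over a single piece (or, in PWPL, a single variable's domain), the gradient decomposes over pieces, which is what avoids repeated inference on the full graph during training.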
Original language: English
Pages (from-to): 165-194
Number of pages: 30
Journal: Machine Learning
Volume: 77
Issue number: 2-3
DOIs
Publication status: Published - Dec 2009

Keywords

  • Graphical models
  • Conditional random fields
  • Local training
  • Belief propagation
