TY - UNPB
T1 - Fast, Piecewise Training for Discriminative Finite-state and Parsing Models
AU - Sutton, Charles
AU - McCallum, Andrew
PY - 2005
Y1 - 2005
N2 - Discriminative models for sequences and trees, such as linear-chain conditional random fields (CRFs) and max-margin parsing, have shown great promise because they combine the ability to incorporate arbitrary input features and the benefits of principled global inference over their structured outputs. However, since parameter estimation in these models involves repeatedly performing this global inference, training can be very slow. We present piecewise training, a new training method that combines the speed of local training with the accuracy of global training by incorporating a limited amount of global information derived from previous errors of the model. On named-entity and part-of-speech data, we show that our new method not only trains in less than one-fifth the time of a CRF and yields improved accuracy over the MEMM, but surprisingly also provides a statistically significant gain in accuracy over the CRF. Also, we present preliminary results showing a potential application to efficient training of discriminative parsers.
AB - Discriminative models for sequences and trees, such as linear-chain conditional random fields (CRFs) and max-margin parsing, have shown great promise because they combine the ability to incorporate arbitrary input features and the benefits of principled global inference over their structured outputs. However, since parameter estimation in these models involves repeatedly performing this global inference, training can be very slow. We present piecewise training, a new training method that combines the speed of local training with the accuracy of global training by incorporating a limited amount of global information derived from previous errors of the model. On named-entity and part-of-speech data, we show that our new method not only trains in less than one-fifth the time of a CRF and yields improved accuracy over the MEMM, but surprisingly also provides a statistically significant gain in accuracy over the CRF. Also, we present preliminary results showing a potential application to efficient training of discriminative parsers.
M3 - Working paper
T3 - Center for Intelligent Information Retrieval Technical Reports
BT - Fast, Piecewise Training for Discriminative Finite-state and Parsing Models
PB - Center for Intelligent Information Retrieval
ER -