The standard training method for Conditional Random Fields (CRFs) is very slow for large-scale applications. As an alternative, piecewise training divides the full graph into pieces, trains them independently, and combines the learned weights at test time. In this paper, we present separate training for undirected models based on the novel Co-occurrence Rate Factorization (CR-F). Separate training is a local training method. In contrast to piecewise training, separate training is exact. In contrast to MEMMs, separate training is unaffected by the label bias problem. Experiments show that separate training (i) is unaffected by the label bias problem; (ii) reduces the training time from weeks to seconds; and (iii) achieves results competitive with standard and piecewise training on linear-chain CRFs.
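As a rough illustration of why a factorization of this kind can be exact on a linear chain, the sketch below assumes that the co-occurrence rate of two variables is CR(A; B) = P(A, B) / (P(A) P(B)); the abstract does not spell out the definition, so this is an assumption. It uses a toy, unconditional first-order Markov chain with made-up probabilities rather than a CRF conditioned on observations, and all function names are hypothetical. It checks numerically that the product of single-node marginals and adjacent-pair co-occurrence rates reproduces the joint distribution.

```python
import itertools

def chain_joint(init, trans, n):
    """Joint distribution of a first-order Markov chain of length n."""
    k = len(init)
    joint = {}
    for ys in itertools.product(range(k), repeat=n):
        p = init[ys[0]]
        for a, b in zip(ys, ys[1:]):
            p *= trans[a][b]
        joint[ys] = p
    return joint

def marginal(joint, idxs):
    """Marginalize the joint onto the positions listed in idxs."""
    out = {}
    for ys, p in joint.items():
        key = tuple(ys[i] for i in idxs)
        out[key] = out.get(key, 0.0) + p
    return out

def cr_factorized(joint, n):
    """Rebuild the joint from node marginals and adjacent-pair co-occurrence rates.

    Assumed definition: CR(A; B) = P(A, B) / (P(A) * P(B)).
    """
    singles = [marginal(joint, (i,)) for i in range(n)]
    pairs = [marginal(joint, (i, i + 1)) for i in range(n - 1)]
    recon = {}
    for ys in joint:
        p = 1.0
        for i in range(n):
            p *= singles[i][(ys[i],)]
        for i in range(n - 1):
            a, b = ys[i], ys[i + 1]
            p *= pairs[i][(a, b)] / (singles[i][(a,)] * singles[i + 1][(b,)])
        recon[ys] = p
    return recon

# Toy parameters (illustrative values only, not taken from the report).
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.2, 0.8]]

joint = chain_joint(init, trans, 3)
recon = cr_factorized(joint, 3)
max_err = max(abs(joint[ys] - recon[ys]) for ys in joint)
```

Because each factor depends only on a node or an adjacent pair, each could in principle be estimated on its own, which is the intuition behind a local training method; how the report actually estimates these factors is not stated in the abstract.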
|Place of Publication||Enschede|
|Publisher||Centre for Telematics and Information Technology (CTIT)|
|Number of pages||10|
|Publication status||Published - 1 Oct 2012|
|Name||CTIT Technical Report Series|
|Publisher||Centre for Telematics and Information Technology, University of Twente|
- Conditional random fields
- undirected graph factorization
- natural language processing
Zhu, Z., Hiemstra, D., Apers, P. M. G., & Wombacher, A. (2012). Separate Training for Conditional Random Fields Using Co-occurrence Rate Factorization. (CTIT Technical Report Series; No. TR-CTIT-12-29). Enschede: Centre for Telematics and Information Technology (CTIT).