Separate training for conditional random fields using co-occurrence rate factorization

Zhemin Zhu, Djoerd Hiemstra, Peter M.G. Apers, Andreas Wombacher

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic



Conditional Random Fields (CRFs) are undirected graphical models that are well suited to many natural language processing (NLP) tasks, such as part-of-speech (POS) tagging and named entity recognition (NER). The standard training method for CRFs can be very slow for large-scale applications. As an alternative, piecewise training divides the full graph into pieces, trains them independently, and combines the learned weights at test time. However, piecewise training does not scale well with the cardinality of the variables. In this paper we present separate training for undirected models, based on the novel Co-occurrence Rate Factorization (CR-F). Separate training is a local training method without global propagation. In contrast to directed Markov models such as MEMMs, separate training is unaffected by the label bias problem even though it is a locally normalized method. We run experiments on two NLP tasks, POS tagging and NER. Results show that separate training (i) is unaffected by the label bias problem; (ii) reduces the training time from weeks to seconds; and (iii) obtains results competitive with standard and piecewise training on linear-chain CRFs. Separate training is a promising technique for scaling undirected models for natural language processing tasks. (More details can be found here:
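The abstract names the co-occurrence rate factorization without defining it. As a rough illustration only, the sketch below assumes the usual definition CR(a; b) = P(a, b) / (P(a) P(b)) and checks numerically that, for a first-order chain, the joint probability equals the product of the per-position marginals and the co-occurrence rates of adjacent labels. The label set, the probabilities, and every function name are made up for the example. It ignores observations and features entirely, so it shows the factorization identity and not the authors' training procedure.

```python
from itertools import product

# Hypothetical two-label chain; all numbers are invented for illustration.
LABELS = ("N", "V")
INIT = {"N": 0.7, "V": 0.3}
TRANS = {"N": {"N": 0.4, "V": 0.6}, "V": {"N": 0.8, "V": 0.2}}


def marginals(length):
    """Return the per-position label marginals P(y_i) of the chain."""
    margs = [dict(INIT)]
    for _ in range(length - 1):
        prev = margs[-1]
        margs.append({b: sum(prev[a] * TRANS[a][b] for a in LABELS) for b in LABELS})
    return margs


def chain_joint(seq):
    """Joint probability by the chain rule: P(y_1) * prod P(y_{i+1} | y_i)."""
    p = INIT[seq[0]]
    for a, b in zip(seq, seq[1:]):
        p *= TRANS[a][b]
    return p


def cr_joint(seq):
    """Joint probability as unary marginals times adjacent co-occurrence rates."""
    margs = marginals(len(seq))
    p = 1.0
    for i, y in enumerate(seq):
        p *= margs[i][y]
    for i, (a, b) in enumerate(zip(seq, seq[1:])):
        pair = margs[i][a] * TRANS[a][b]             # P(y_i = a, y_{i+1} = b)
        p *= pair / (margs[i][a] * margs[i + 1][b])  # assumed CR(y_i; y_{i+1})
    return p


def max_gap(length=4):
    """Largest absolute difference between the two joints over all sequences."""
    return max(abs(chain_joint(s) - cr_joint(s)) for s in product(LABELS, repeat=length))
```

Because each factor here is a local quantity (a single-position marginal or an adjacent-pair ratio), each can in principle be estimated on its own. That is consistent with the abstract's description of a local training method without global propagation, though how the paper estimates the factors is not stated in this record.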
Original language: Undefined
Title of host publication: Proceedings of the 23rd Meeting of Computational Linguistics in the Netherlands, CLIN 2013
Place of Publication: Enschede
Publisher: University of Twente
Number of pages: 1
ISBN (Print): not assigned
Publication status: Published - Jan 2013
Event: 23rd Meeting of Computational Linguistics in the Netherlands, CLIN 2013 - Enschede, the Netherlands
Duration: 18 Jan 2013 - 18 Jan 2013

Publication series

Publisher: University of Twente


Conference: 23rd Meeting of Computational Linguistics in the Netherlands, CLIN 2013
Other: 18 January 2013


  • EWI-23379
  • IR-86473
  • METIS-297659
