Closed form maximum likelihood estimator of conditional random fields

Zhemin Zhu, Djoerd Hiemstra, Peter M.G. Apers, Andreas Wombacher

Research output: Book/Report › Report › Professional

Abstract

Training Conditional Random Fields (CRFs) can be very slow for big data. In this paper, we present a new training method for CRFs called Empirical Training, which is motivated by the concept of co-occurrence rate. We show that standard (unregularized) training can have many maximum likelihood estimates (MLEs). Empirical training has a unique closed-form MLE, which is also an MLE of standard training. We are the first to identify the Test Time Problem of standard training, which may lead to low accuracy. Empirical training is immune to this problem. Empirical training is also unaffected by the label bias problem even though it is locally normalized. All of these claims have been verified by experiments. Experiments also show that empirical training reduces training time from weeks to seconds and obtains results competitive with standard and piecewise training on linear-chain CRFs, especially when data are insufficient.
Original language: Undefined
Place of Publication: Enschede
Publisher: Centre for Telematics and Information Technology (CTIT)
Publication status: Published - 12 Feb 2013

Publication series

Name: CTIT Technical Report Series
Publisher: Centre for Telematics and Information Technology, University of Twente
No.: TR-CTIT-13-03
ISSN (Print): 1381-3625

Keywords

  • EWI-23097
  • IR-84386
  • METIS-296313

Cite this

Zhu, Z., Hiemstra, D., Apers, P. M. G., & Wombacher, A. (2013). Closed form maximum likelihood estimator of conditional random fields. (CTIT Technical Report Series; No. TR-CTIT-13-03). Enschede: Centre for Telematics and Information Technology (CTIT).
Zhu, Zhemin ; Hiemstra, Djoerd ; Apers, Peter M.G. ; Wombacher, Andreas. / Closed form maximum likelihood estimator of conditional random fields. Enschede : Centre for Telematics and Information Technology (CTIT), 2013. (CTIT Technical Report Series; TR-CTIT-13-03).
@book{6eec86d04e144dd39957ef4e0dba4e1a,
title = "Closed form maximum likelihood estimator of conditional random fields",
abstract = "Training Conditional Random Fields (CRFs) can be very slow for big data. In this paper, we present a new training method for CRFs called {\em Empirical Training}, which is motivated by the concept of co-occurrence rate. We show that standard (unregularized) training can have many maximum likelihood estimates (MLEs). Empirical training has a unique closed-form MLE, which is also an MLE of standard training. We are the first to identify the \emph{Test Time Problem} of standard training, which may lead to low accuracy. Empirical training is immune to this problem. Empirical training is also unaffected by the label bias problem even though it is locally normalized. All of these claims have been verified by experiments. Experiments also show that empirical training reduces training time from weeks to seconds and obtains results competitive with standard and piecewise training on linear-chain CRFs, especially when data are insufficient.",
keywords = "EWI-23097, IR-84386, METIS-296313",
author = "Zhemin Zhu and Djoerd Hiemstra and Apers, {Peter M.G.} and Andreas Wombacher",
year = "2013",
month = "2",
day = "12",
language = "Undefined",
series = "CTIT Technical Report Series",
publisher = "Centre for Telematics and Information Technology (CTIT)",
number = "TR-CTIT-13-03",
address = "Netherlands",

}

Zhu, Z, Hiemstra, D, Apers, PMG & Wombacher, A 2013, Closed form maximum likelihood estimator of conditional random fields. CTIT Technical Report Series, no. TR-CTIT-13-03, Centre for Telematics and Information Technology (CTIT), Enschede.

Closed form maximum likelihood estimator of conditional random fields. / Zhu, Zhemin; Hiemstra, Djoerd; Apers, Peter M.G.; Wombacher, Andreas.

Enschede : Centre for Telematics and Information Technology (CTIT), 2013. (CTIT Technical Report Series; No. TR-CTIT-13-03).

TY - BOOK

T1 - Closed form maximum likelihood estimator of conditional random fields

AU - Zhu, Zhemin

AU - Hiemstra, Djoerd

AU - Apers, Peter M.G.

AU - Wombacher, Andreas

PY - 2013/2/12

Y1 - 2013/2/12

AB - Training Conditional Random Fields (CRFs) can be very slow for big data. In this paper, we present a new training method for CRFs called Empirical Training, which is motivated by the concept of co-occurrence rate. We show that standard (unregularized) training can have many maximum likelihood estimates (MLEs). Empirical training has a unique closed-form MLE, which is also an MLE of standard training. We are the first to identify the Test Time Problem of standard training, which may lead to low accuracy. Empirical training is immune to this problem. Empirical training is also unaffected by the label bias problem even though it is locally normalized. All of these claims have been verified by experiments. Experiments also show that empirical training reduces training time from weeks to seconds and obtains results competitive with standard and piecewise training on linear-chain CRFs, especially when data are insufficient.

KW - EWI-23097

KW - IR-84386

KW - METIS-296313

M3 - Report

T3 - CTIT Technical Report Series

BT - Closed form maximum likelihood estimator of conditional random fields

PB - Centre for Telematics and Information Technology (CTIT)

CY - Enschede

ER -

Zhu Z, Hiemstra D, Apers PMG, Wombacher A. Closed form maximum likelihood estimator of conditional random fields. Enschede: Centre for Telematics and Information Technology (CTIT), 2013. (CTIT Technical Report Series; TR-CTIT-13-03).