Co-occurrence rate networks: towards separate training for undirected graphical models

Zhemin Zhu

Abstract

Dependence is a universal phenomenon which can be observed everywhere. In machine learning, probabilistic graphical models (PGMs) represent dependence relations with graphs. PGMs find wide applications in natural language processing (NLP), speech processing, computer vision, biomedicine, information retrieval, etc. Many traditional models, such as hidden Markov models (HMMs) and Kalman filters, can be put under the umbrella of PGMs. The central idea of PGMs is to decompose (factorize) a joint probability into a product of local factors. Learning, inference and storage can be conducted efficiently over the factorized representation. In this thesis, we propose a novel framework motivated by the Minimum Shared Information Principle (MSIP): we try to find a factorization in which the information shared between factors is minimal. In other words, we try to make factors as independent as possible. The benefit of doing this is that we can train factors separately without spending much effort to guarantee consistency between them. To achieve this goal, we develop a theoretical framework called co-occurrence rate networks (CRNs) to obtain such a factorization. Experimental results on three important natural language processing tasks show that our separate training method is two orders of magnitude faster than conditional random fields, while achieving competitive quality (often better on the overall quality metric F1).

The second contribution of this thesis is applying PGMs to a real-world NLP application: open relation extraction (ORE). In open relation extraction, two entities in a sentence are given, and the goal is to automatically extract their relation expression. ORE is a core technique, especially in the age of big data, for transforming unstructured information into structured data. We propose our model SimpleIE for this task. The basic idea is to decompose an extraction pattern into a sequence of simplification operations (components). The benefit of doing this is that these components can be recombined in new ways to generate new extraction patterns. Experimental results on three benchmark data sets show that SimpleIE boosts recall and F1 by at least 15% compared with seven ORE systems.
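
The idea of factorizing a joint probability into local factors can be illustrated with a small sketch. The abstract does not define the co-occurrence rate, so the code below assumes the ratio CR(X; Y) = P(X, Y) / (P(X) P(Y)); the three-variable chain, its probabilities, and all function names are hypothetical and chosen only for illustration. Under that assumption, the joint of a first-order chain equals the product of the single-variable marginals and the co-occurrence rates of adjacent pairs, which the code checks numerically.

```python
from itertools import product

# Hypothetical toy chain X1 -> X2 -> X3 over binary labels (numbers are made up).
p_x1 = {0: 0.6, 1: 0.4}
trans = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # P(next | current)

def joint(x1, x2, x3):
    """Reference joint probability of the toy chain."""
    return p_x1[x1] * trans[x1][x2] * trans[x2][x3]

def marginal(fixed):
    """Sum the joint over all assignments that agree with `fixed` (position -> value)."""
    return sum(joint(*xs) for xs in product((0, 1), repeat=3)
               if all(xs[i] == v for i, v in fixed.items()))

def cr(i, a, j, b):
    """Assumed co-occurrence rate: P(Xi=a, Xj=b) / (P(Xi=a) * P(Xj=b))."""
    return marginal({i: a, j: b}) / (marginal({i: a}) * marginal({j: b}))

def crn_joint(x1, x2, x3):
    """Joint rebuilt from single-variable marginals and adjacent-pair rates."""
    singles = marginal({0: x1}) * marginal({1: x2}) * marginal({2: x3})
    return singles * cr(0, x1, 1, x2) * cr(1, x2, 2, x3)

# Largest disagreement between the two ways of computing the joint.
max_gap = max(abs(joint(*xs) - crn_joint(*xs))
              for xs in product((0, 1), repeat=3))
print(max_gap < 1e-12)
```

Each factor in `crn_joint` depends only on the marginal of one variable or of one adjacent pair, which suggests why such factors could be estimated separately from data; how the thesis actually trains them is not described in this record.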
Original language: Undefined
Awarding Institution
  • University of Twente
Supervisors/Advisors
  • Apers, Peter M.G., Supervisor
  • Hiemstra, Djoerd, Advisor
Date of Award: 16 Oct 2015
Place of Publication: Enschede, The Netherlands
Publisher: Universiteit Twente
Print ISBNs: 978-90-365-3932-6
DOIs: 10.3990/1.9789036539326
State: Published - 16 Oct 2015

Fingerprint

Factorization
Speech processing
Hidden Markov models
Information retrieval
Kalman filters
Computer vision
Learning systems

Keywords

  • EWI-26925
  • IR-97338
  • METIS-311922

Cite this

Zhu, Zhemin. / Co-occurrence rate networks: towards separate training for undirected graphical models. Enschede, The Netherlands : Universiteit Twente, 2015. 178 p.
@phdthesis{5dd84efd8ca24c05ae675a4202435d82,
title = "Co-occurrence rate networks: towards separate training for undirected graphical models",
keywords = "EWI-26925, IR-97338, METIS-311922",
author = "Zhemin Zhu",
note = "SIKS Dissertation Series No. 2015-22",
year = "2015",
month = "10",
doi = "10.3990/1.9789036539326",
isbn = "978-90-365-3932-6",
publisher = "Universiteit Twente",
school = "University of Twente",

}

Research output: Scientific › PhD Thesis - Research UT, graduation UT

TY - THES

T1 - Co-occurrence rate networks: towards separate training for undirected graphical models

AU - Zhu, Zhemin

N1 - SIKS Dissertation Series No. 2015-22

PY - 2015/10/16

Y1 - 2015/10/16

KW - EWI-26925

KW - IR-97338

KW - METIS-311922

U2 - 10.3990/1.9789036539326

DO - 10.3990/1.9789036539326

M3 - PhD Thesis - Research UT, graduation UT

SN - 978-90-365-3932-6

PB - Universiteit Twente

ER -

Zhu Z. Co-occurrence rate networks: towards separate training for undirected graphical models. Enschede, The Netherlands: Universiteit Twente, 2015. 178 p. DOI: 10.3990/1.9789036539326