Query-based multi-documents summarization using linguistic knowledge and content word expansion

Seyedasadollah Abdiesfandani, Norisma Idris, Rasim M. Alguliyev, Ramiz M. Aliguliyev

Research output: Contribution to journal › Article › Academic › peer-review

26 Citations (Scopus)
19 Downloads (Pure)

Abstract

This paper introduces a query-based summarization method that combines semantic relations between words with their syntactic composition to extract meaningful sentences from document sets. Current statistical methods fail to capture meaning when comparing a sentence with a user query, so the extracted sentences often conflict with users’ requirements. The proposed method can improve the quality of document summaries because it avoids extracting a sentence whose similarity with the query is high but whose meaning is different. It computes semantic and syntactic similarity for both sentence-to-sentence and sentence-to-query comparisons. To reduce redundancy in the summary, it uses a greedy algorithm to impose a diversity penalty on the sentences. In addition, it expands the words in both the query and the sentences to tackle the problem of limited information, bridging the lexical gaps between semantically similar contexts that are expressed using different wording. Experimental results show that the proposed method improves performance compared with the participating systems in DUC 2006, and that it also performs better than other existing techniques on the DUC 2005 and DUC 2006 datasets.
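
The abstract names two mechanisms, word expansion and a greedy diversity penalty, without giving formulas. The sketch below is one possible illustration of how they can fit together, not the authors' implementation. It makes loud assumptions: similarity is plain Jaccard word overlap (a stand-in for the paper's semantic and syntactic similarity), the `synonyms` table is a hypothetical hand-written dictionary, and the names `expand`, `jaccard`, `greedy_select`, and `penalty` are invented for this example.

```python
def tokens(text):
    """Lowercase whitespace tokenization; a deliberately crude stand-in."""
    return set(text.lower().split())


def expand(words, synonyms):
    """Add every listed synonym of each word (hypothetical synonym table)."""
    expanded = set(words)
    for word in words:
        expanded |= synonyms.get(word, set())
    return expanded


def jaccard(a, b):
    """Word-overlap similarity in [0, 1]; NOT the paper's similarity measure."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def greedy_select(sentences, query, synonyms, k=2, penalty=1.0):
    """Pick up to k sentences, penalizing those similar to earlier picks."""
    query_bag = expand(tokens(query), synonyms)
    bags = [expand(tokens(s), synonyms) for s in sentences]
    # Initial score: relevance of each expanded sentence to the expanded query.
    scores = [jaccard(bag, query_bag) for bag in bags]
    chosen = []
    remaining = set(range(len(sentences)))
    while remaining and len(chosen) < k:
        # Highest score wins; ties go to the earlier sentence.
        best = max(remaining, key=lambda i: (scores[i], -i))
        chosen.append(best)
        remaining.remove(best)
        # Diversity penalty: lower the score of sentences that overlap
        # with the sentence just selected.
        for j in remaining:
            scores[j] -= penalty * jaccard(bags[j], bags[best]) * scores[best]
    return [sentences[i] for i in chosen]
```

With the query "fast car engines" and a table that maps "car" and "automobile" to each other, the sentences "the car is fast" and "the automobile is fast" expand to the same word set. Once the first is selected, the penalty drops the second to zero, so a less similar sentence that still matches the query is chosen next.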

Original language: English
Pages (from-to): 1785–1801
Journal: Soft computing
Volume: 21
Publication status: Published - 2017

Keywords

  • Query-based multi-document summarization
  • Graph-based sentence ranking
  • Query expansion
  • Extractive summarization
