Abstract
Text summarization is the process of creating a concise version of one or more documents while preserving their main content. In this paper, a two-stage sentence selection method for text summarization is proposed that aims to cover all topics and reduce redundancy in summaries. In the first stage, the set of sentences is clustered using the k-means method to discover all topics. In the second stage, an optimal selection of sentences is made: from each cluster, salient sentences are selected according to their contribution to the topic (cluster) and their proximity to other sentences in the cluster, so as to avoid redundancy, until the specified summary length is reached. Sentence selection is modeled as an optimization problem, which is solved with an adaptive differential evolution algorithm that uses a novel mutation strategy. In tests on the benchmark DUC2001 and DUC2002 data sets, the ROUGE values of the summaries obtained by the proposed approach demonstrate its validity compared with traditional sentence selection methods and with the top three performing systems for DUC2001 and DUC2002.
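The two-stage idea in the abstract can be illustrated with a short Python sketch. It is not the authors' method: it assumes bag-of-words vectors with cosine similarity, seeds k-means deterministically with a farthest-first rule, and swaps the adaptive differential evolution optimizer for a simple greedy rule that trades closeness to the cluster centroid against similarity to sentences already selected. All function names and the parameters `max_words` and `redundancy_weight` are hypothetical.

```python
import math
from collections import Counter


def vectorize(sentence):
    """Bag-of-words term counts (an assumed, simplified representation)."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({t: c / len(vectors) for t, c in total.items()})


def kmeans(vectors, k, iterations=20):
    """Stage 1: cosine-based k-means with deterministic farthest-first seeding."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Next seed: the vector least similar to every seed chosen so far.
        centroids.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        )
    labels = None
    for _ in range(iterations):
        new_labels = [
            max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = centroid(members)
    return labels, centroids


def summarize(sentences, k=2, max_words=30, redundancy_weight=0.5):
    """Stage 2 (greedy stand-in for the optimizer): visit clusters in turn and
    pick the sentence closest to its centroid, penalized by its similarity to
    sentences already chosen, until no sentence fits the word budget."""
    vectors = [vectorize(s) for s in sentences]
    labels, centroids = kmeans(vectors, k)
    selected, length = [], 0
    remaining = set(range(len(sentences)))
    progress = True
    while progress:
        progress = False
        for c in range(k):
            candidates = [
                i for i in sorted(remaining)
                if labels[i] == c and length + len(sentences[i].split()) <= max_words
            ]
            if not candidates:
                continue

            def score(i):
                redundancy = max(
                    (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
                )
                return cosine(vectors[i], centroids[c]) - redundancy_weight * redundancy

            best = max(candidates, key=score)
            selected.append(best)
            remaining.discard(best)
            length += len(sentences[best].split())
            progress = True
    return [sentences[i] for i in sorted(selected)]  # restore document order


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the cat chased the mouse",
        "stock markets fell sharply today",
        "stock markets recovered after the fall",
    ]
    for line in summarize(docs, k=2, max_words=12):
        print(line)
```

Visiting the clusters in round-robin order is one simple way to spread the summary across topics; an optimizer such as the differential evolution named in the abstract would instead search over whole sentence subsets under the same length constraint.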
Field | Value |
---|---|
Original language | English |
Number of pages | 19 |
Journal | International Journal of Intelligent Information Technologies |
Volume | 13 |
Issue number | 1 |
Publication status | Published - 2017 |
Externally published | Yes |