Abstract:
This paper proposes a multi-document summarization method based on spectral clustering. The method first clusters topic-relevant sentences from the documents, then scores each sentence by combining the importance of its topic cluster with sentence position, length, and other factors. Sentences are extracted in descending order of score until the word limit is reached, and the extracted sentences form the final summary. Experimental results show that the method outperforms traditional summarization methods and effectively improves summary quality.
Key words:
multi-document summarization,
spectral clustering,
information retrieval
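The scoring and extraction steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Sentence` type, the weights `w_cluster`, `w_position`, `w_length`, the `ideal_length` parameter, and the use of cluster size as a proxy for cluster importance are all assumptions made for the example. The cluster labels are assumed to come from a separate spectral clustering step that is not shown, and word counts use whitespace splitting, which would need to be replaced by a tokenizer or character count for Chinese text.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Sentence:
    text: str
    doc_id: int    # which source document the sentence came from
    position: int  # 0-based index of the sentence inside its document
    cluster: int   # topic-cluster label, assumed to come from spectral clustering


def score_sentences(sentences, w_cluster=0.5, w_position=0.3, w_length=0.2,
                    ideal_length=20):
    """Return one importance score per sentence (higher is more important).

    The weights, the cluster-size heuristic and ideal_length are
    illustrative assumptions, not values taken from the paper.
    """
    cluster_sizes = Counter(s.cluster for s in sentences)
    total = len(sentences)
    scores = []
    for s in sentences:
        # Cluster importance: share of all sentences that fall in this cluster.
        cluster_score = cluster_sizes[s.cluster] / total
        # Position: earlier sentences in a document score higher.
        position_score = 1.0 / (1 + s.position)
        # Length: penalise sentences far from an assumed ideal word count.
        n_words = len(s.text.split())
        length_score = 1.0 / (1 + abs(n_words - ideal_length) / ideal_length)
        scores.append(w_cluster * cluster_score
                      + w_position * position_score
                      + w_length * length_score)
    return scores


def extract_summary(sentences, scores, max_words):
    """Greedily pick sentences in descending score order within a word budget."""
    # Sort by score (descending); ties are broken by original index.
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    chosen, used = [], 0
    for idx in ranked:
        n_words = len(sentences[idx].text.split())
        if used + n_words <= max_words:
            chosen.append(idx)
            used += n_words
    # Restore source order so the summary reads naturally.
    chosen.sort(key=lambda i: (sentences[i].doc_id, sentences[i].position))
    return [sentences[i].text for i in chosen]
```

A sentence that does not fit the remaining budget is skipped rather than ending the loop, so a shorter, lower-ranked sentence can still fill the leftover space. Whether the original method does this is not stated in the abstract.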
LIN Li, HU Xia, ZHU Jun-Yan. Novel Multi-document Summarization Method Based on Spectral Clustering[J]. Computer Engineering (计算机工程), 2010, 36(22): 64-65. (in Chinese)