Abstract:
This paper studies automatic summarization and presents a summary extraction method for multi-topic documents. The method combines statistics with a text relationship map and applies a community partition algorithm from complex networks. It first extracts the sentences with higher weights, then builds a text relationship map from the computed similarity between sentences, and finally uses the community partition algorithm to solve the sub-topic partition problem. Experimental results show that the method extracts summaries of good quality from multi-topic documents and can extract more sub-topics.
Key words:
multiple topics document,
automatic summarization,
statistical model,
text relationship map,
sub-topic community partition
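The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the weighting scheme, the similarity measure, or the community partition algorithm, so everything below is an assumption made for illustration: sentence weights are the average document-level term frequency, similarity is cosine similarity over term counts, and the partition step uses simple label propagation as a stand-in. The function names and the `keep` and `threshold` parameters are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    # Lowercased alphabetic tokens; no stop-word removal in this sketch.
    return re.findall(r"[a-z]+", sentence.lower())


def cosine(a, b):
    # Cosine similarity between two term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def label_propagation(nodes, sim, max_rounds=20):
    # Stand-in community partition (assumption, not the paper's algorithm):
    # each node adopts the label with the largest total edge weight among
    # its neighbours; ties go to the smallest label.
    labels = {i: i for i in nodes}
    for _ in range(max_rounds):
        changed = False
        for i in nodes:
            votes = Counter()
            for j, w in sim[i].items():
                votes[labels[j]] += w
            if not votes:
                continue  # isolated sentence keeps its own label
            best = min(votes, key=lambda lab: (-votes[lab], lab))
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    groups = {}
    for i in nodes:
        groups.setdefault(labels[i], []).append(i)
    return list(groups.values())


def summarize(sentences, keep=0.8, threshold=0.2):
    vectors = [Counter(tokenize(s)) for s in sentences]

    # Step 1: statistical weight per sentence (assumed: mean term frequency).
    doc_tf = Counter()
    for v in vectors:
        doc_tf.update(v)
    weights = [sum(doc_tf[t] for t in v) / len(v) if v else 0.0 for v in vectors]

    # Keep the higher-weight sentences as nodes of the relationship map.
    n_keep = max(1, math.ceil(keep * len(sentences)))
    ranked = sorted(range(len(sentences)), key=lambda i: (-weights[i], i))
    kept = sorted(ranked[:n_keep])

    # Step 2: text relationship map, with an edge where similarity >= threshold.
    sim = {i: {} for i in kept}
    for a in kept:
        for b in kept:
            if a < b:
                s = cosine(vectors[a], vectors[b])
                if s >= threshold:
                    sim[a][b] = sim[b][a] = s

    # Step 3: partition the map into sub-topic communities, then pick the
    # highest-weight sentence of each community, in document order.
    picks = [min(g, key=lambda i: (-weights[i], i))
             for g in label_propagation(kept, sim)]
    return [sentences[i] for i in sorted(picks)]
```

Calling `summarize(["Cats chase mice", "Cats chase birds", "Stocks rise fast", "Stocks fall fast"], keep=1.0)` yields one sentence per sub-topic. A real implementation would likely need word segmentation suited to the document language and a modularity-based partition algorithm; this sketch only shows how the three stages fit together.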
LIAO Tao, LIU Zong-Tian, WANG Li. Research and Implementation of Summarization Extraction in Multiple Topics Document[J]. Computer Engineering, 2011, 37(6): 21-23.