
Computer Engineering ›› 2011, Vol. 37 ›› Issue (6): 21-23. doi: 10.3969/j.issn.1000-3428.2011.06.008

• Doctoral Dissertation •

Research and Implementation of Summarization Extraction in Multiple Topics Document

LIAO Tao 1,2, LIU Zong-tian 2, WANG Li 2

  1. School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232001, China
  2. School of Computer Engineering and Science, Shanghai University, Shanghai 200072, China
  • Online: 2011-03-20 Published: 2011-03-29
  • About the authors: LIAO Tao (born 1977), male, lecturer and Ph.D. candidate, main research interests: Web data mining and text classification; LIU Zong-tian, professor and doctoral supervisor; WANG Li, master's student
  • Supported by:
    National Natural Science Foundation of China (60975033)

Abstract: This paper studies automatic summarization and proposes a summary extraction method for multi-topic documents. The method combines statistics with a text relationship map and builds on a community partition algorithm from complex networks. It extracts the higher-weighted sentences in a document, constructs a text relationship map from pairwise sentence similarity, and applies the community partition algorithm to solve the sub-topic partition problem. Experimental results show that the method yields good-quality summaries for multi-topic documents and extracts a relatively large number of sub-topics.

Key words: multi-topic document, automatic summarization, statistical model, text relationship map, sub-topic community partition
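
The pipeline outlined in the abstract (weight sentences, link similar sentences into a text relationship map, split the map into sub-topic groups, pick sentences per group) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the function names `tokenize`, `cosine`, and `summarize`, the term-frequency weighting, the cosine similarity measure, the threshold of 0.2, and the English-only tokenizer are all hypothetical. Connected components are used as a crude stand-in for the community partition algorithm, which the abstract does not specify, and the sketch builds the map over all sentences and then keeps the top-weighted one per group, a simplification of the order given in the abstract.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    # Assumption: English text; Chinese input would need a word segmenter instead.
    return re.findall(r"[a-z]+", sentence.lower())


def cosine(a, b):
    # Cosine similarity between two term-count vectors (Counters).
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarize(sentences, threshold=0.2):
    vectors = [Counter(tokenize(s)) for s in sentences]

    # Statistical weight (assumed): mean document-level frequency of a sentence's terms.
    doc_tf = Counter()
    for vec in vectors:
        doc_tf.update(vec)
    weights = [
        sum(doc_tf[t] * c for t, c in vec.items()) / sum(vec.values()) if vec else 0.0
        for vec in vectors
    ]

    # Text relationship map: link two sentences when their similarity reaches the threshold.
    n = len(sentences)
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if cosine(vectors[i], vectors[j]) >= threshold:
                neighbors[i].add(j)
                neighbors[j].add(i)

    # Stand-in for community partition: connected components found by depth-first search.
    communities, seen = [], set()
    for start in range(n):
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            group.append(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        communities.append(sorted(group))

    # Keep the highest-weighted sentence of each group, in original document order.
    picks = sorted(max(group, key=lambda i: weights[i]) for group in communities)
    return communities, [sentences[i] for i in picks]
```

Calling `summarize` on a list of sentence strings returns the sentence-index groups together with one representative sentence per group. A modularity-based community detection method would be a closer match to the complex-network approach the abstract names; connected components are used here only to keep the sketch dependency-free.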
