Computer Engineering ›› 2006, Vol. 32 ›› Issue (11): 209-210, 218.

• Artificial Intelligence and Recognition Technology •

A Topic Partitioning Method Based on a Genetic Algorithm

傅间莲,陈群秀   

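  1. State Key Laboratory of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing 100084
  • Publication date: 2006-06-05 Release date: 2006-06-05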

Study on Topic Partition Based on Genetic Algorithm

FU Jianlian, CHEN Qunxiu

  1. State Key Lab of Intelligent Technology and System, Department of Computer Science and Technology, Tsinghua University, Beijing 100084
  • Online: 2006-06-05 Published: 2006-06-05

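Abstract: This paper proposes an algorithm that builds a paragraph-level vector space model and uses a genetic algorithm to partition a text into topics. It addresses the problem of analyzing an article's discourse structure, so that summaries of multi-topic articles are more comprehensive in content and more balanced in structure. Experimental results show that the algorithm's topic partitioning accuracy is 89.3% for multi-topic articles and 94.6% for single-topic articles.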

Keywords: automatic summarization; vector space model; genetic algorithm; topic partitioning

Abstract: This paper builds a paragraph-based vector space model (VSM) of the whole article and then proposes a method for multi-topic text partitioning based on a genetic algorithm (GA). The method addresses discourse structure analysis for multi-topic articles and gives their abstracts more comprehensive content and a more balanced structure. A closed test shows that the precision of topic partitioning reaches 89.3% for multi-topic texts and 94.6% for single-topic texts.

Key words: Automatic abstraction; Vector space model; Genetic algorithm (GA); Topic segmentation
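
The abstract names the two ingredients (a paragraph-level vector space model and a genetic algorithm) but does not give the chromosome encoding, the fitness function, or the term weighting. The Python sketch below is therefore only one plausible reading, not the authors' implementation. Its assumptions are: raw term-frequency vectors, cosine similarity, a 0/1 chromosome marking a topic boundary between adjacent paragraphs, and a fitness equal to mean within-segment similarity minus mean cross-segment similarity. All function and parameter names (`partition_topics`, `pop_size`, and so on) are hypothetical.

```python
import math
import random
from collections import Counter

def paragraph_vectors(paragraphs):
    """Assumed VSM: one raw term-frequency vector (a Counter) per paragraph."""
    return [Counter(p.lower().split()) for p in paragraphs]

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def segments(boundaries):
    """Decode a 0/1 chromosome: boundaries[i-1] == 1 starts a new topic at paragraph i."""
    segs, current = [], [0]
    for i, cut in enumerate(boundaries, start=1):
        if cut:
            segs.append(current)
            current = []
        current.append(i)
    segs.append(current)
    return segs

def fitness(boundaries, sim):
    """Assumed fitness: mean within-segment minus mean cross-segment similarity."""
    label = {}
    for s, seg in enumerate(segments(boundaries)):
        for i in seg:
            label[i] = s
    n = len(boundaries) + 1
    within, across = [], []
    for i in range(n):
        for j in range(i + 1, n):
            (within if label[i] == label[j] else across).append(sim[i][j])
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return mean(within) - mean(across)

def partition_topics(paragraphs, pop_size=30, generations=60,
                     mutation_rate=0.1, seed=0):
    """Evolve boundary chromosomes; return topic segments as lists of paragraph indices."""
    rng = random.Random(seed)
    vecs = paragraph_vectors(paragraphs)
    n = len(vecs)
    if n < 2:
        return [list(range(n))]
    sim = [[cosine(a, b) for b in vecs] for a in vecs]
    score = lambda ind: fitness(ind, sim)
    pop = [[rng.randint(0, 1) for _ in range(n - 1)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        next_pop = pop[:2]  # elitism: keep the two best chromosomes unchanged
        while len(next_pop) < pop_size:
            # tournament selection of two parents
            p1 = max(rng.sample(pop, 3), key=score)
            p2 = max(rng.sample(pop, 3), key=score)
            cut = rng.randint(0, n - 1)  # one-point crossover position
            child = p1[:cut] + p2[cut:]
            # bit-flip mutation
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_pop.append(child)
        pop = next_pop
    return segments(max(pop, key=score))
```

Calling `partition_topics(list_of_paragraph_strings)` returns a list of segments, each a list of paragraph indices. Encoding boundaries rather than topic labels keeps every segment contiguous, which suits discourse-structure analysis; a single-topic article should then evolve toward the all-zero chromosome.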