Computer Engineering ›› 2007, Vol. 33 ›› Issue (08): 180-181. doi: 10.3969/j.issn.1000-3428.2007.08.063

• Artificial Intelligence and Recognition Technology •

Chinese Automatic Summarization Based on Thematic Sentence Discovery

WANG Meng¹, LI Chungui¹, TANG Peihe¹, WANG Xiaorong²

  (1. Department of Computer Engineering, Guangxi University of Technology, Liuzhou 545006; 2. Department of Computer Science, Central China Normal University, Wuhan 430079)
  • Online: 2007-04-20 Published: 2007-04-20

Abstract: Automatic summarization is one of the main research fields in natural language processing. This paper proposes a Chinese automatic summarization method based on thematic sentence discovery. The method uses terms rather than words as the minimal semantic unit and weights each term by term length term frequency (TLTF) to obtain feature words. It then clusters the sentences with an improved k-means algorithm and discovers thematic sentences from the clustering results. Experimental results indicate that the proposed method clearly outperforms the traditional method under the proposed evaluation scheme.

Key words: Thematic sentence discovery, Automatic text summarization, Sentence clustering, Natural language processing
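
The pipeline outlined in the abstract (term weighting, sentence clustering, thematic sentence selection) might be sketched roughly as follows. This is only an illustration under several assumptions, not the authors' implementation: the abstract gives no TLTF formula, so the weight is assumed here to be term length multiplied by term frequency; the paper's improved k-means is not described, so plain k-means with cosine similarity and a deterministic farthest-first initialization is swapped in; and each sentence is assumed to be already segmented into terms. All function names are hypothetical.

```python
import math
from collections import Counter


def tltf_weights(sentences):
    """Assumed TLTF: weight(term) = len(term) * frequency in the document."""
    freq = Counter(term for sent in sentences for term in sent)
    return {term: len(term) * count for term, count in freq.items()}


def vectorize(sentence, vocab, weights):
    """Map a sentence (list of terms) onto the weighted term space."""
    present = set(sentence)
    return [float(weights[t]) if t in present else 0.0 for t in vocab]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def kmeans(vectors, k, max_iters=50):
    """Plain k-means on cosine similarity (stand-in for the improved variant)."""
    # Farthest-first initialization: start from the first sentence, then add
    # the sentence least similar to the centroids chosen so far.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        nxt = min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        centroids.append(list(nxt))
    assign = None
    for _ in range(max_iters):
        new_assign = [
            max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors
        ]
        if new_assign == assign:  # assignments stable: converged
            break
        assign = new_assign
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centroids


def thematic_sentences(sentences, k):
    """Return indices of one representative sentence per cluster."""
    weights = tltf_weights(sentences)
    vocab = sorted(weights)
    vectors = [vectorize(s, vocab, weights) for s in sentences]
    assign, centroids = kmeans(vectors, k)
    picked = []
    for c in range(k):
        members = [i for i, a in enumerate(assign) if a == c]
        if members:
            # Assumed selection rule: the sentence closest to its centroid.
            picked.append(max(members, key=lambda i: cosine(vectors[i], centroids[c])))
    return sorted(picked)


if __name__ == "__main__":
    # Toy document: each sentence is a list of pre-extracted terms.
    docs = [
        ["stock", "market"],
        ["stock", "market", "price"],
        ["market", "price"],
        ["rain", "storm"],
        ["rain", "storm", "wind"],
        ["storm", "wind"],
    ]
    print(thematic_sentences(docs, k=2))
```

Choosing the sentence nearest each cluster centroid is one common heuristic for picking a representative; the paper may select thematic sentences by a different criterion.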
