
Computer Engineering ›› 2010, Vol. 36 ›› Issue (5): 81-83. doi: 10.3969/j.issn.1000-3428.2010.05.030

• Software Technology and Database •

Study on Text Clustering Based on Semantic Density

LIU Jin-ling

  1. (Department of Computer, Huaiyin Institute of Technology, Huaian 223003)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2010-03-05 Published: 2010-03-05

Abstract: This paper combines the semantic similarity of text data with density-based clustering and presents a text clustering method based on semantic density. Taking the characteristics of text data into account, the method starts from a randomly selected text object, expands toward the densest region of the text data, and organizes the texts into an ordered sequence that reflects the structure of the corpus; this sequence is then used for clustering. When handling noisy text data, a result-reorganization strategy is used to help reassign the noise texts. Experimental results show that the method has good clustering performance.

Key words: density, cluster, neighborhood, clustering
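
The abstract does not spell out the algorithm, so the following Python sketch only illustrates the general idea under stated assumptions; it is not the paper's implementation. Jaccard overlap on whitespace tokens stands in for the paper's semantic similarity, which is not defined here. The names `density_order`, `cut_clusters`, `reassign_noise`, and the `threshold` parameter are hypothetical, and treating singleton clusters as noise is a simplification of the result-reorganization step.

```python
import random


def jaccard(a, b):
    """Stand-in similarity (assumption): token-set overlap of two texts."""
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def density_order(texts, sim=jaccard, start=None, seed=0):
    """Order texts by expanding from a start text toward its most similar neighbours.

    Returns a list of (index, reach) pairs, where reach is the highest
    similarity between that text and any text placed earlier in the order.
    """
    n = len(texts)
    if n == 0:
        return []
    if start is None:
        # The abstract starts from a randomly selected text object.
        start = random.Random(seed).randrange(n)
    order = [(start, 0.0)]
    best = {i: sim(texts[start], texts[i]) for i in range(n) if i != start}
    while best:
        nxt = max(best, key=best.get)  # closest unvisited text
        order.append((nxt, best.pop(nxt)))
        for i in best:  # refresh reach values against the newly added text
            best[i] = max(best[i], sim(texts[nxt], texts[i]))
    return order


def cut_clusters(order, threshold):
    """Split the ordered sequence wherever reach drops below threshold."""
    clusters = []
    for idx, reach in order:
        if not clusters or reach < threshold:
            clusters.append([idx])
        else:
            clusters[-1].append(idx)
    return clusters


def reassign_noise(texts, clusters, sim=jaccard):
    """Assumed noise handling: move each singleton into its most similar cluster."""
    dense = [c[:] for c in clusters if len(c) > 1]
    noise = [c[0] for c in clusters if len(c) == 1]
    if not dense:
        return [[i] for i in noise]
    for i in noise:
        # Earlier reassigned noise texts count as members for later ones.
        target = max(dense, key=lambda c: max(sim(texts[i], texts[j]) for j in c))
        target.append(i)
    return dense
```

A typical call would be `reassign_noise(texts, cut_clusters(density_order(texts), threshold))`, where `threshold` is a similarity cutoff chosen for the corpus. The ordering step is quadratic in the number of texts, which is acceptable for a sketch but would need a neighbourhood index for a large corpus.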

CLC Number: