
Computer Engineering ›› 2011, Vol. 37 ›› Issue (24): 25-27. doi: 10.3969/j.issn.1000-3428.2011.24.009

• Software Technology and Database •

Closed Subtree Clustering Algorithm Supporting Real-time Incremental Update

HUANG Wei (黄伟), GUO Xin (郭鑫), ZHOU Qing-ping (周清平)

  1. (School of Information Management and Engineering, Jishou University, Zhangjiajie, Hunan 427000, China)
  • Received: 2011-05-19 Online: 2011-12-20 Published: 2011-12-20
  • About the authors: HUANG Wei (born 1981), male, lecturer, master's degree, main research interests: clustering algorithms and data mining; GUO Xin, teaching assistant, master's degree; ZHOU Qing-ping, professor, Ph.D.
  • Funding:
    Jishou University Research Fund Project (11JD051); Jishou University Teaching Reform Research Fund Project (10JD043)

Abstract: Existing tree clustering algorithms cannot promptly update their clustering results when the tree database is updated in real time. To address this, the paper builds a closed subtree clustering model that supports real-time incremental updates, which solves the incremental clustering of closed subtrees and improves clustering efficiency. Exploiting the semi-structured nature of trees, it proposes a more accurate tree similarity measure that combines node semantics with the structural features of nodes and edges. On that basis, three algorithms are used: Closed Tree Update Mining (CTUM) for incremental updating of closed subtrees, Tree Cluster (TC) for clustering, and Update Tree Cluster (UTC) for incremental clustering. Experimental results show that the proposed algorithms achieve high running efficiency and clustering accuracy.

Key words: clustering algorithm, data mining, closed subtree, incremental update
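
The abstract does not state the similarity formula or the internals of CTUM, TC, or UTC. As a rough illustration of the general idea only, the sketch below assumes a similarity that blends Jaccard overlap of node labels (standing in for node semantics) with Jaccard overlap of parent-child edges (standing in for structure), and assumes a simple threshold rule for placing a newly arrived tree into existing clusters. The names `tree_similarity`, `assign_incrementally`, `alpha`, and `threshold` are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Optional, Set, Tuple

# A rooted labeled tree stored as {node_label: parent_label or None}.
# Assumption: labels are unique within one tree, which keeps the sketch short.
Tree = Dict[str, Optional[str]]


def edge_set(tree: Tree) -> Set[Tuple[str, str]]:
    """Return the (parent, child) label pairs of the tree."""
    return {(parent, child) for child, parent in tree.items() if parent is not None}


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap; two empty sets are treated as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def tree_similarity(t1: Tree, t2: Tree, alpha: float = 0.5) -> float:
    """Blend node-label overlap and edge overlap (assumed form, not the paper's)."""
    node_part = jaccard(set(t1), set(t2))
    edge_part = jaccard(edge_set(t1), edge_set(t2))
    return alpha * node_part + (1 - alpha) * edge_part


def assign_incrementally(
    clusters: List[List[Tree]], new_tree: Tree, threshold: float = 0.5
) -> int:
    """Add new_tree to the most similar cluster, or open a new cluster.

    Existing clusters are left untouched, so nothing is re-clustered from scratch.
    Returns the index of the cluster that received the tree.
    """
    best_index, best_score = -1, threshold
    for i, members in enumerate(clusters):
        # Single-link style score: similarity to the closest member.
        score = max((tree_similarity(new_tree, m) for m in members), default=0.0)
        if score >= best_score:
            best_index, best_score = i, score
    if best_index == -1:
        clusters.append([new_tree])
        return len(clusters) - 1
    clusters[best_index].append(new_tree)
    return best_index
```

With `alpha = 0.5`, two trees that share most labels but attach them under different parents score lower than trees that also share edges, which is the kind of effect a combined semantic and structural measure is meant to capture.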

CLC number: