
Computer Engineering (计算机工程) ›› 2011, Vol. 37 ›› Issue (22): 35-38. doi: 10.3969/j.issn.1000-3428.2011.22.009

• Software Technology and Database •

Synthesized Difference Degree Calculation Based on Attribute and Object Relation Information

GAO Xue-dong, WU Ling-yu, WU Sen, GU Shu-juan

  1. (School of Economics and Management, University of Science & Technology Beijing, Beijing 100083, China)

  • Received: 2011-05-18 Online: 2011-11-18 Published: 2011-11-20
  • About the authors: GAO Xue-dong (b. 1963), male, professor and doctoral supervisor, main research interests: spatial clustering, data mining; WU Ling-yu, Ph.D. candidate; WU Sen, professor and doctoral supervisor; GU Shu-juan, Ph.D. candidate
  • Supported by:
    National Natural Science Foundation of China (70771007)

Abstract: Traditional clustering algorithms consider only attribute similarity and make little use of the relationships between objects. To address this, relationship data is converted into relational attribute data by turning the relation information into attributes, and a method for computing the dissimilarity of relational attributes is proposed. On this basis, interval and ordinal attribute variables are normalized, categorical variables are converted into binary variables, and relation variables are treated as binary variables, which yields a method for computing a synthesized difference degree that accounts for both attribute information and the relationships between objects. Theoretical analysis and a clustering example show that clustering based on this difference degree is more accurate and produces more practical results.

Key words: data mining, attribute space clustering, relational attribute, synthesized difference degree

CLC number:
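The abstract does not give the paper's formulas, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes min-max scaling for interval variables and ordinal ranks, one-hot encoding for categorical variables, a relation attribute built as a binary adjacency row with a 1 on the diagonal, and an equally weighted average of per-variable contributions. All function names (`min_max`, `one_hot`, `relation_attributes`, `synthesized_dissimilarity`) and the sample data are hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def min_max(values: Sequence[float]) -> List[float]:
    """Scale an interval variable (or ordinal ranks) to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values: Sequence[str]) -> List[List[int]]:
    """Turn a categorical variable into one binary column per category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def relation_attributes(n: int, pairs: Set[Tuple[int, int]]) -> List[List[int]]:
    """Turn undirected relations into one binary row per object.

    Assumption: row i has a 1 in column j when i and j are related, plus a 1
    on the diagonal so that directly related objects agree on both columns.
    """
    rows = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for i, j in pairs:
        rows[i][j] = 1
        rows[j][i] = 1
    return rows


def synthesized_dissimilarity(
    scaled_a: Sequence[float],
    scaled_b: Sequence[float],
    binary_a: Sequence[int],
    binary_b: Sequence[int],
) -> float:
    """Equal-weight average: |x - y| for scaled variables, mismatch for binary ones."""
    total = sum(abs(x - y) for x, y in zip(scaled_a, scaled_b))
    total += sum(1 for x, y in zip(binary_a, binary_b) if x != y)
    count = len(scaled_a) + len(binary_a)
    return total / count if count else 0.0


# Hypothetical data: three objects, with objects 0 and 1 related to each other.
ages = min_max([20, 30, 40])            # interval variable
grades = min_max([1, 2, 3])             # ordinal ranks, treated as interval
colors = one_hot(["red", "red", "blue"])  # categorical variable
relations = relation_attributes(3, {(0, 1)})

scaled = [[ages[i], grades[i]] for i in range(3)]
binary = [colors[i] + relations[i] for i in range(3)]


def d(i: int, j: int) -> float:
    """Synthesized dissimilarity between objects i and j of the sample data."""
    return synthesized_dissimilarity(scaled[i], scaled[j], binary[i], binary[j])


print(d(0, 1), d(0, 2), d(1, 2))
```

In this toy data the related pair (0, 1) also shares a category, so it ends up closer than either pair involving object 2. A clustering routine that accepts a precomputed dissimilarity matrix could consume the values of `d` directly.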