
Computer Engineering (计算机工程) ›› 2010, Vol. 36 ›› Issue (4): 77-78. doi: 10.3969/j.issn.1000-3428.2010.04.027

• Software Technology and Database •

An Improved CAIM Algorithm

LI Hui (李慧), YAN De-qin (闫德勤), ZHANG Ying-chun (张迎春)

  1. LI Hui, YAN De-qin, ZHANG Ying-chun
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2010-02-20 Published: 2010-02-20

Modified Algorithm of CAIM

LI Hui, YAN De-qin, ZHANG Ying-chun   

  1. (School of Computer and Information Technology, Liaoning Normal University, Dalian 116081)
  • Received:1900-01-01 Revised:1900-01-01 Online:2010-02-20 Published:2010-02-20

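Abstract (from the Chinese original): In the CAIM algorithm, the discretization criterion considers only the dependency between the attribute and the most frequent class in each interval, which causes over-discretization and imprecise results. To address this, an improved CAIM algorithm is proposed. It discretizes attributes in ascending order of attribute importance and, based on rough set theory, introduces the concept of conditional attribute discernibility rate. Together with the approximation accuracy, this rate controls the final degree of discretization of the information table, which effectively solves the over-discretization problem. In the experiments, C4.5 and support vector machines are each applied to the discretized data for recognition and classification prediction, and the results confirm the effectiveness of the algorithm.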

Keywords (from the Chinese original): continuous attribute discretization, rough set, attribute discernibility rate

Abstract: In the Class-Attribute Interdependency Maximization (CAIM) algorithm, the discretization criterion accounts only for the dependency between the attribute and the leading (most frequent) class within each interval. This causes CAIM to over-discretize, producing unreasonable discretization results that reduce the predictive accuracy of a classifier. This paper proposes a modified CAIM algorithm. It takes attribute importance into account during discretization and introduces the concept of attribute discernibility rate, based on rough set theory. The attribute discernibility rate and the approximation quality together control the discretization intervals, which effectively resolves the over-discretization problem. Experiments that apply C4.5 and SVM to the discretized data show that the proposed algorithm is effective.
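
To make the limitation described in the abstract concrete, the following is a minimal sketch of the standard CAIM criterion as it is commonly defined: the average over intervals of max_r² / M_r, where max_r is the count of the leading class in interval r and M_r is the total count in that interval. It is not the modified algorithm proposed in this paper, whose attribute discernibility rate is not defined in the abstract. The function names `quanta_matrix` and `caim` and the toy data are illustrative assumptions.

```python
from typing import List, Sequence


def quanta_matrix(values: Sequence[float],
                  labels: Sequence[str],
                  boundaries: Sequence[float]) -> List[List[int]]:
    """Count samples per (class, interval).

    Intervals are [b0, b1], (b1, b2], ..., (b_{n-1}, b_n].
    Rows follow the sorted class labels; columns follow the intervals.
    """
    classes = sorted(set(labels))
    n_intervals = len(boundaries) - 1
    counts = [[0] * n_intervals for _ in classes]
    for v, y in zip(values, labels):
        r = 0
        # Advance until v falls at or below the interval's upper boundary.
        while r < n_intervals - 1 and v > boundaries[r + 1]:
            r += 1
        counts[classes.index(y)][r] += 1
    return counts


def caim(counts: List[List[int]]) -> float:
    """Standard CAIM value: mean over intervals of max_r**2 / M_r."""
    n_intervals = len(counts[0])
    total = 0.0
    for r in range(n_intervals):
        column = [row[r] for row in counts]
        m_r = sum(column)
        if m_r:  # skip empty intervals to avoid division by zero
            total += max(column) ** 2 / m_r
    return total / n_intervals


if __name__ == "__main__":
    values = [1, 2, 3, 4, 5, 6]
    labels = list("aaabbb")
    # A boundary at 3 separates the two classes perfectly.
    print(caim(quanta_matrix(values, labels, [1, 3, 6])))  # 3.0
    # A boundary at 2 mixes one 'a' sample into the second interval.
    print(caim(quanta_matrix(values, labels, [1, 2, 6])))  # 2.125
    # Only the leading class matters: both single-interval matrices score the same.
    print(caim([[4], [1], [1]]) == caim([[4], [2], [0]]))  # True
```

The last comparison shows the point the abstract criticizes: two intervals with different distributions of the non-leading classes receive the same score, because only the leading class count enters the criterion.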

Key words: discretization of continuous attributes, rough set, attribute discernibility rate
