
Computer Engineering, 2010, Vol. 36, Issue 11: 75-77. doi: 10.3969/j.issn.1000-3428.2010.11.027

• Software Technology and Database •

Decision Tree Constructing Algorithm Based on Rough Set
(基于粗糙集的决策树构造算法)

DING Chun-rong (丁春荣)1, LI Long-shu (李龙澍)2, YANG Bao-hua (杨宝华)1

  1. School of Information and Computer, Anhui Agricultural University, Hefei 230036
  2. School of Computer Science and Technology, Anhui University, Hefei 230039
  • Published: 2010-06-05 Online: 2010-06-05
  • About the authors: DING Chun-rong (born 1975), female, lecturer, M.S.; main research interests: data mining, rough sets. LI Long-shu, professor, doctoral supervisor. YANG Bao-hua, associate professor, M.S.
  • Supported by:

    National Natural Science Foundation of China (60273043); Anhui Provincial Natural Science Foundation for Universities (KJ2007B158)

Abstract:

ID3 tends to build complex decision trees that classify inefficiently. To address this, the paper proposes a decision tree construction algorithm based on rough set theory. The algorithm uses the weighted classification roughness as the heuristic function for choosing the attribute at each node. Compared with information gain, this heuristic measures an attribute's overall contribution to classification more comprehensively and is simpler to compute. To eliminate the effect of noisy data on attribute selection and leaf-node generation, the algorithm is further optimized with the variable precision rough set model. Experimental results show that the trees built by the new algorithm are smaller and classify more accurately than those built by ID3.

Key words: data mining, rough set, variable precision rough set, decision tree, weighted classification roughness
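
The abstract names the heuristic but does not give its formula, so the following is only a minimal Python sketch of the general idea, not the authors' definition. It uses the standard rough set lower and upper approximations, relaxed by a minimum inclusion degree `beta` in the spirit of the variable precision model. The weighting (each decision class's roughness weighted by its share of the samples), the function names, and the toy `rows` data are all assumptions made for illustration.

```python
from collections import defaultdict
from fractions import Fraction


def partition(rows, attr):
    """Group row indices into equivalence classes by their value on `attr`."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[row[attr]].add(i)
    return list(blocks.values())


def approximations(blocks, target, beta=Fraction(1)):
    """Return the (lower, upper) approximations of the index set `target`.

    `beta` is the minimum inclusion degree, 1/2 < beta <= 1. Pass it as a
    Fraction to avoid float rounding at the thresholds. beta = 1 gives the
    classical approximations; a smaller beta tolerates some noise.
    """
    lower, upper = set(), set()
    for block in blocks:
        inclusion = Fraction(len(block & target), len(block))
        if inclusion >= beta:
            lower |= block
        if inclusion > 1 - beta:
            upper |= block
    return lower, upper


def weighted_roughness(rows, attr, decision, beta=Fraction(1)):
    """Assumed heuristic: class-size-weighted roughness of each decision class."""
    blocks = partition(rows, attr)
    total = Fraction(0)
    for cls in partition(rows, decision):
        lower, upper = approximations(blocks, cls, beta)
        # Convention chosen here: an empty upper approximation counts as fully rough.
        roughness = 1 - Fraction(len(lower), len(upper)) if upper else Fraction(1)
        total += Fraction(len(cls), len(rows)) * roughness
    return total


def best_attribute(rows, attrs, decision, beta=Fraction(1)):
    """Pick the attribute with the smallest weighted roughness."""
    return min(attrs, key=lambda a: weighted_roughness(rows, a, decision, beta))


# Hypothetical toy data, not taken from the paper.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rain", "windy": "no", "play": "no"},
    {"outlook": "rain", "windy": "yes", "play": "no"},
    {"outlook": "cloudy", "windy": "no", "play": "yes"},
    {"outlook": "cloudy", "windy": "yes", "play": "no"},
]

print(best_attribute(rows, ["outlook", "windy"], "play"))  # prints "outlook"
```

A tree builder would call `best_attribute` at each node, split on the chosen attribute, and recurse. Lowering `beta` (for example to `Fraction(3, 4)`) lets a block that is mostly one class count as certain, which is one plausible way noise tolerance could enter attribute selection and leaf generation.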

CLC Number: