Abstract: This paper introduces the concept of a pure interval and, building on it, proposes a method based on pure interval reduction for handling numeric attributes, as an improvement to the SPRINT algorithm. The method partitions the value range of an attribute into multiple intervals using an equal-width histogram. Pure intervals are reduced, while the minimum Gini value within the impure intervals is computed exactly. This preserves the accuracy of the split while reducing the amount of computation.
Key words: Decision tree; SPRINT algorithm; Pure interval reduction; Gini index
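The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not spell out: a "pure interval" is taken to be a histogram bin whose records all share one class label, and "reduction" is taken to mean that candidate split points strictly inside such a bin are skipped, so that only bin boundaries and the interiors of impure bins are evaluated. The names `best_split`, `split_gini` and `n_bins` are hypothetical.

```python
from collections import Counter


def gini(counts, total):
    """Gini index of a class-count table holding `total` records."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def split_gini(left, right):
    """Weighted Gini index of a binary split given class counts per side."""
    n_left, n_right = sum(left.values()), sum(right.values())
    n = n_left + n_right
    return (n_left / n) * gini(left, n_left) + (n_right / n) * gini(right, n_right)


def best_split(values, labels, n_bins=4):
    """Return (threshold, gini, evaluated) for one numeric attribute.

    Assumption: a bin is "pure" when all of its records share one class;
    split points strictly inside a pure bin are skipped.
    """
    records = sorted(zip(values, labels))
    lo, hi = records[0][0], records[-1][0]
    width = (hi - lo) / n_bins or 1.0  # guard against a constant attribute

    def bin_of(v):
        # Equal-width histogram; the maximum value falls into the last bin.
        return min(int((v - lo) / width), n_bins - 1)

    # First pass: collect the set of classes seen in each bin.
    bin_classes = [set() for _ in range(n_bins)]
    for v, y in records:
        bin_classes[bin_of(v)].add(y)
    pure = [len(classes) <= 1 for classes in bin_classes]

    # Second pass: scan the sorted records, moving one record at a time
    # from the right partition to the left partition.
    left, right = Counter(), Counter(labels)
    best_gini, best_threshold, evaluated = float("inf"), None, 0
    for i in range(len(records) - 1):
        v, y = records[i]
        left[y] += 1
        right[y] -= 1
        nxt = records[i + 1][0]
        if nxt == v:
            continue  # no split point between equal values
        if pure[bin_of(v)] and bin_of(nxt) == bin_of(v):
            continue  # interior of a pure interval: skipped (reduced)
        evaluated += 1
        g = split_gini(left, right)
        if g < best_gini:
            best_gini, best_threshold = g, (v + nxt) / 2
    return best_threshold, best_gini, evaluated


if __name__ == "__main__":
    values = [1, 2, 3, 4, 5, 6, 7, 8]
    labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
    print(best_split(values, labels, n_bins=4))  # (4.5, 0.0, 3)
```

In this toy run all four bins are pure, so only the three bin boundaries are evaluated instead of the seven midpoints an exhaustive scan would check. Skipping the interior of a single-class run does not change the result, because the weighted Gini index of a split is concave along such a run and so reaches its minimum at one of the run's ends.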
CLC number:
LIU Youjun (刘友军); WANG Linlin (汪林林). Improvement of SPRINT Algorithm[J]. Computer Engineering (计算机工程), 2006, 32(16): 55-57.