Abstract:
This paper proposes an improved decision tree classification algorithm based on the naive Bayesian and ID3 algorithms. It introduces an objective attribute importance parameter, adopts a conditional independence assumption that is weaker than that of the naive Bayesian algorithm, and uses weighted independent information entropy as the criterion for selecting the splitting attribute. Theoretical analysis and experimental results show that the improved algorithm overcomes, to a certain extent, the ID3 algorithm's bias toward multi-valued attributes, and that it improves execution efficiency and classification accuracy.
Key words:
naive Bayesian algorithm,
ID3 algorithm,
information gain,
objective attribute importance,
conditional independence assumption,
weighted independent information entropy
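The multi-value bias mentioned in the abstract can be seen with the standard ID3 information gain alone. The sketch below is not the paper's weighted independent information entropy, whose definition is not given here; it only illustrates the baseline criterion that the paper sets out to improve. The toy data and the names `windy` and `row_id` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Standard ID3 information gain from splitting `labels` on `values`."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy remaining after the split, weighted by subset size.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data set: six samples, two classes.
labels = ["yes", "yes", "no", "no", "yes", "no"]
windy = ["f", "f", "t", "t", "f", "f"]   # two-valued attribute
row_id = [1, 2, 3, 4, 5, 6]              # unique per sample, no predictive value

gain_windy = information_gain(windy, labels)
gain_row_id = information_gain(row_id, labels)

print(f"gain(windy)  = {gain_windy:.3f}")
print(f"gain(row_id) = {gain_row_id:.3f}")
```

On this data the identifier-like attribute `row_id` receives the maximum possible gain, because every subset it produces is pure, even though it cannot generalize to new samples. Plain ID3 would therefore select it over `windy`, which is the tendency the abstract refers to.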
HUANG Yu-Da, WANG Tuo-Dan. Decision Tree Classification Based on Naive Bayesian and ID3 Algorithm[J]. Computer Engineering, 2012, 38(14): 41-43.
黄宇达, 王迤冉. 基于朴素贝叶斯与ID3算法的决策树分类[J]. 计算机工程, 2012, 38(14): 41-43.