
Computer Engineering ›› 2018, Vol. 44 ›› Issue (11): 27-32,39. doi: 10.19678/j.issn.1000-3428.0048410


Improved Naive Bayes Algorithm Based on Particle Swarm Optimization

QIU Ningjia, LI Na, HU Xiaojuan, WANG Peng, SUN Shuangzi

  1. College of Computer Science and Technology, Changchun University of Science and Technology, Changchun 130022, China
  • Received: 2017-08-21 Online: 2018-11-15 Published: 2018-11-15

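  • About the authors: QIU Ningjia (b. 1984), male, lecturer, Ph.D., main research interest: data mining; LI Na, master's student; HU Xiaojuan, lecturer, Ph.D.; WANG Peng, associate professor, Ph.D.; SUN Shuangzi, associate professor, M.S.
  • Supported by: Key Science and Technology Research Project of the Jilin Province Science and Technology Development Plan (20150204036GX); Jilin Provincial Industrial Innovation Special Fund (2017C051)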

Abstract: To address the degradation in classification performance caused by the idealized conditional independence assumption of the Naive Bayes (NB) algorithm, an improved Particle Swarm Optimization-Naive Bayes (PSO-NB) algorithm is proposed. During text preprocessing, a weight factor and intra-class and inter-class dispersion factors are introduced for attribute reduction. Based on the weighted NB model, the word-frequency ratio of each conditional attribute is used as its initial weight, and the PSO algorithm iteratively searches for the globally optimal feature weight vector. This vector then supplies the weight of each feature word in the weighted model, from which the classifier is generated. The performance of the PSO-NB algorithm is analyzed on a classical dataset. The results show that the improved algorithm effectively reduces redundant attributes, lowers computational complexity, and achieves high accuracy and recall.

Key words: Naive Bayes (NB), mutual information, attribute reduction, Particle Swarm Optimization (PSO) algorithm, weight optimization
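
The weighting idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the common weighted NB rule, in which a class is scored by log P(c) plus the sum over terms of w_t * n_t * log P(t | c), it uses training accuracy as the PSO fitness, and it omits the attribute-reduction step. The toy corpus, the probabilities, and all function and parameter names (`weighted_nb_predict`, `pso`, `inertia`, `c1`, `c2`) are hypothetical.

```python
import math
import random

def weighted_nb_predict(doc, priors, cond, weights):
    """Pick the class maximizing log P(c) + sum_t w_t * n_t * log P(t | c)."""
    best_class, best_score = None, -math.inf
    for c in priors:
        score = math.log(priors[c])
        for term, count in doc.items():
            score += weights[term] * count * math.log(cond[c][term])
        if score > best_score:
            best_class, best_score = c, score
    return best_class

def accuracy(weight_vec, vocab, docs, labels, priors, cond):
    """Fitness assumed for this sketch: fraction of documents classified correctly."""
    weights = dict(zip(vocab, weight_vec))
    hits = sum(weighted_nb_predict(d, priors, cond, weights) == y
               for d, y in zip(docs, labels))
    return hits / len(docs)

def pso(fitness, init, n_particles=10, n_iter=30,
        inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard global-best PSO that maximizes `fitness`; weights are kept >= 0."""
    rng = random.Random(seed)
    dim = len(init)
    # Particle 0 starts at the initial weights; the others are jittered copies.
    pos = [list(init)] + [[max(0.0, x + rng.uniform(-0.5, 0.5)) for x in init]
                          for _ in range(n_particles - 1)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = max(0.0, pos[i][d] + vel[i][d])
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

# Hypothetical toy data: term counts per document and already-smoothed P(t | c).
vocab = ["goal", "vote", "the"]
priors = {"sport": 0.5, "politics": 0.5}
cond = {"sport": {"goal": 0.6, "vote": 0.1, "the": 0.3},
        "politics": {"goal": 0.1, "vote": 0.5, "the": 0.4}}
docs = [{"goal": 2, "the": 1}, {"vote": 1, "the": 3}, {"goal": 1, "vote": 1}]
labels = ["sport", "politics", "sport"]

# Initial weight of each term: its share of all term occurrences (word-frequency ratio).
totals = [sum(d.get(t, 0) for d in docs) for t in vocab]
init_weights = [n / sum(totals) for n in totals]

def fit(w):
    return accuracy(w, vocab, docs, labels, priors, cond)

init_fit = fit(init_weights)
best_weights, best_fit = pso(fit, init_weights)
print("initial accuracy:", init_fit, "accuracy after PSO:", best_fit)
```

Because one particle starts at the word-frequency-ratio weights and the global best is only ever replaced by a strictly better position, the returned fitness cannot fall below that of the initial weights. Whether it improves depends on the data and on the fitness function chosen.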

