Computer Engineering, 2018, Vol. 44, Issue 11: 27-32, 39. doi: 10.19678/j.issn.1000-3428.0048410

• Advanced Computing and Data Processing •

Improved Naive Bayes Algorithm Based on Particle Swarm Optimization

QIU Ningjia, LI Na, HU Xiaojuan, WANG Peng, SUN Shuangzi

  1. College of Computer Science and Technology, Changchun University of Science and Technology, Changchun 130022, China
  • Received: 2017-08-21 Online: 2018-11-15 Published: 2018-11-15
  • About the authors: QIU Ningjia (born 1984), male, lecturer, Ph.D., main research interest: data mining; LI Na, master's student; HU Xiaojuan, lecturer, Ph.D.; WANG Peng, associate professor, Ph.D.; SUN Shuangzi, associate professor, master's degree.
  • Funding:

    Key Science and Technology Research Project of the Jilin Province Science and Technology Development Plan (20150204036GX); Jilin Provincial Special Fund for Industrial Innovation (2017C051)

Abstract: The Naive Bayes (NB) algorithm assumes that attributes are conditionally independent, an idealized assumption that degrades classification performance. To address this problem, an improved Particle Swarm Optimization-Naive Bayes (PSO-NB) algorithm is proposed. During text preprocessing, a weight factor and intra-class and inter-class dispersion factors are introduced for attribute reduction. Based on the weighted NB model, the word-frequency ratio of each conditional attribute is used as its initial weight, and the PSO algorithm iteratively searches for the globally optimal feature weight vector. This vector supplies the weight of each feature word in the weighted model, from which the classifier is generated. The performance of the PSO-NB algorithm is analyzed on classical datasets. The results show that the improved algorithm effectively reduces redundant attributes, lowers computational complexity, and achieves high accuracy and recall.

Key words: Naive Bayes (NB), mutual information, attribute reduction, Particle Swarm Optimization (PSO) algorithm, weight optimization
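
The abstract gives no formulas, so the following is only a minimal sketch of the general idea: a weighted multinomial NB classifier whose per-term weights are tuned by a textbook PSO loop. Several details are assumptions and not taken from the paper: the fitness function (training accuracy), the PSO constants (inertia 0.7, acceleration 1.5), clamping weights to [0, 1], the reading of "word-frequency ratio" as a term's share of all term occurrences, and the toy data. The attribute-reduction step is omitted, and all function names are hypothetical.

```python
import math
import random

def train_nb(docs, labels, n_terms, n_classes):
    """Multinomial NB with Laplace smoothing (assumes every class occurs in labels)."""
    class_counts = [0] * n_classes
    term_counts = [[1.0] * n_terms for _ in range(n_classes)]  # add-one smoothing
    for doc, c in zip(docs, labels):
        class_counts[c] += 1
        for t, n in enumerate(doc):
            term_counts[c][t] += n
    log_prior = [math.log(k / len(docs)) for k in class_counts]
    log_like = [[math.log(v / sum(row)) for v in row] for row in term_counts]
    return log_prior, log_like

def predict(doc, w, log_prior, log_like):
    """Weighted NB score: log P(c) + sum_t w[t] * count[t] * log P(t|c)."""
    scores = [lp + sum(w[t] * n * ll[t] for t, n in enumerate(doc))
              for lp, ll in zip(log_prior, log_like)]
    return scores.index(max(scores))

def accuracy(w, docs, labels, model):
    hits = sum(predict(d, w, *model) == y for d, y in zip(docs, labels))
    return hits / len(docs)

def pso_weights(docs, labels, model, n_particles=10, n_iter=30, seed=0):
    """Search for a term-weight vector with standard PSO (constants are assumptions)."""
    rng = random.Random(seed)
    n_terms = len(docs[0])
    total = sum(map(sum, docs))
    # Assumed initial weights: each term's share of all term occurrences.
    init = [sum(d[t] for d in docs) / total for t in range(n_terms)]
    pos = [init[:]] + [[rng.random() for _ in range(n_terms)]
                       for _ in range(n_particles - 1)]
    vel = [[0.0] * n_terms for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_fit = [accuracy(p, docs, labels, model) for p in pos]
    g = best_fit.index(max(best_fit))
    g_pos, g_fit = best_pos[g][:], best_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for t in range(n_terms):
                r1, r2 = rng.random(), rng.random()
                vel[i][t] = (0.7 * vel[i][t]
                             + 1.5 * r1 * (best_pos[i][t] - pos[i][t])
                             + 1.5 * r2 * (g_pos[t] - pos[i][t]))
                # Clamp each weight to [0, 1] (an assumption of this sketch).
                pos[i][t] = min(1.0, max(0.0, pos[i][t] + vel[i][t]))
            fit = accuracy(pos[i], docs, labels, model)
            if fit > best_fit[i]:
                best_pos[i], best_fit[i] = pos[i][:], fit
                if fit > g_fit:
                    g_pos, g_fit = pos[i][:], fit
    return g_pos, g_fit, init

# Toy term-count vectors (4 terms, 2 classes); purely illustrative data.
docs = [[3, 0, 1, 0], [2, 1, 0, 0], [0, 2, 0, 3], [0, 1, 1, 2]]
labels = [0, 0, 1, 1]
model = train_nb(docs, labels, n_terms=4, n_classes=2)
weights, fit, init = pso_weights(docs, labels, model)
print("initial weights:", init)
print("optimized weights:", weights, "training accuracy:", fit)
```

Because the first particle starts at the frequency-ratio weights, the returned fitness can never be worse than that of the initial weights. In practice the fitness would more plausibly be measured on held-out data, not on the training set.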
