Computer Engineering

• Artificial Intelligence and Recognition Technology •

Study on BBS Sentiment Classification Based on Random Principal Component Analysis Algorithm

LIU Lin, LIU San-ya, LIU Zhi, TIE Lu

  1. (National Engineering Research Center for E-Learning, Central China Normal University, Wuhan 430079, China)
  • Received: 2013-03-04 Online: 2014-05-15 Published: 2014-05-14
  • About the authors: LIU Lin (b. 1983), male, Ph.D. candidate, research interests: emotion recognition and data mining; LIU San-ya, professor, Ph.D.; LIU Zhi, Ph.D. candidate; TIE Lu, M.S. candidate.
  • Funding:
    National Science and Technology Support Program of the 12th Five-Year Plan (2011BAK08B03); Program for New Century Excellent Talents in University (NCET-11-0654); "Core Electronic Devices, High-end Generic Chips and Basic Software" Major Project (2010ZX01045-001-005); Fundamental Research Funds for the Central Universities, Central China Normal University (CCNU09A02006).

Abstract: To address sentiment classification of Bulletin Board System (BBS) text, an improved Random Subspace Method (RSM) is proposed that exploits the discriminative information in the high-dimensional feature space. When generating subspaces, a weighting function evaluates the classification ability of each feature, and features with better classification ability are selected with higher probability to ensure classification accuracy. In addition, the dimension of the selected subspace is enlarged to include features with classification ability, and principal component analysis then reduces the dimension of the subspace, which ensures both efficiency and subspace diversity. Experimental results show that the proposed algorithm reaches a classification accuracy of 91.3%, higher than that of the conventional RSM, with more stable performance than the baseline algorithm.

Key words: sentiment analysis, ensemble learning, Random Subspace Method (RSM), principal component analysis, Support Vector Machine (SVM), Base Classifier (BC)
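
The subspace-generation step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the weighting function, so `feature_weights` below is a hypothetical Fisher-like score, and the function names, the 0/1 label encoding, and all parameter values are invented for the example. Training the SVM base classifiers on each projected subspace is left out.

```python
import numpy as np


def feature_weights(X, y):
    """Hypothetical weighting function (assumption, not from the paper).

    Scores each feature by the squared gap between the two class means
    divided by the summed class variances. Labels are assumed to be 0/1.
    """
    pos, neg = X[y == 1], X[y == 0]
    gap = (pos.mean(axis=0) - neg.mean(axis=0)) ** 2
    spread = pos.var(axis=0) + neg.var(axis=0) + 1e-12
    return gap / spread


def make_subspace(X, weights, n_selected, n_components, rng):
    """Draw one subspace: weighted feature sampling, then PCA reduction."""
    # Features with larger weights are drawn with higher probability;
    # the small constant keeps every feature selectable.
    w = weights + 1e-12
    idx = rng.choice(X.shape[1], size=n_selected, replace=False, p=w / w.sum())

    # PCA via SVD on the centered, enlarged subspace.
    mean = X[:, idx].mean(axis=0)
    _, _, vt = np.linalg.svd(X[:, idx] - mean, full_matrices=False)
    components = vt[:n_components]  # shape: (n_components, n_selected)

    def project(X_new):
        # Map new samples into the reduced subspace.
        return (X_new[:, idx] - mean) @ components.T

    return idx, project


def make_subspaces(X, y, n_subspaces=5, n_selected=20, n_components=5, seed=0):
    """Generate several randomly drawn, PCA-reduced subspaces."""
    rng = np.random.default_rng(seed)
    w = feature_weights(X, y)
    return [make_subspace(X, w, n_selected, n_components, rng)
            for _ in range(n_subspaces)]


# Toy data: 60 samples, 100 features, the first 10 made informative.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(60, 100))
y = np.array([0, 1] * 30)
X[y == 1, :10] += 2.0

subspaces = make_subspaces(X, y)
reduced = [project(X) for _, project in subspaces]
```

Each matrix in `reduced` would then be passed to its own base classifier (an SVM, according to the key words), and the individual predictions combined. The combination rule is not stated in the abstract, so majority voting would be one assumption among several.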

CLC number: