Abstract:
To address sentiment classification of Bulletin Board System (BBS) text, an improved Random Subspace Method (RSM) is proposed. The method aims to make full use of the discriminative information in the high-dimensional feature space. When generating subspaces, a weighting function first evaluates the classification ability of each feature, and features with better classification ability are selected with higher probability to ensure classification accuracy. The size of the selected subspace is then enlarged, and principal component analysis is used to reduce its dimension, which preserves the efficiency of the algorithm and the diversity of the subspaces. Experimental results show that the proposed algorithm achieves a best accuracy of 91.3%, which is higher than that of the conventional RSM.
Key words:
sentiment analysis,
ensemble learning,
Random Subspace Method (RSM),
principal component analysis,
Support Vector Machine (SVM),
Base Classifier (BC)
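The subspace-generation step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the weighting function, so `feature_weights` uses a hypothetical Fisher-like score, and the names `sample_subspace`, `pca_fit`, `make_subspaces`, `project` and all parameter values (`n_subspaces`, `enlarged_size`, `reduced_dim`) are invented for the example. Labels are assumed to be binary (0/1).

```python
import numpy as np


def feature_weights(X, y):
    """Hypothetical weighting function: a Fisher-like score per feature.

    The source does not say which weighting function it uses; this is a stand-in.
    """
    pos, neg = X[y == 1], X[y == 0]
    gap = (pos.mean(axis=0) - neg.mean(axis=0)) ** 2
    spread = pos.var(axis=0) + neg.var(axis=0) + 1e-12
    return gap / spread


def sample_subspace(weights, size, rng):
    """Draw `size` distinct feature indices, favouring high-weight features."""
    shifted = weights + 1e-12  # keep every feature selectable
    probs = shifted / shifted.sum()
    return rng.choice(len(weights), size=size, replace=False, p=probs)


def pca_fit(X_sub, n_components):
    """Return the mean and the top principal directions of X_sub (via SVD)."""
    mean = X_sub.mean(axis=0)
    _, _, vt = np.linalg.svd(X_sub - mean, full_matrices=False)
    return mean, vt[:n_components]


def make_subspaces(X, y, n_subspaces=5, enlarged_size=20, reduced_dim=5, seed=0):
    """Build one (indices, mean, components) triple per base classifier."""
    rng = np.random.default_rng(seed)
    w = feature_weights(X, y)
    subspaces = []
    for _ in range(n_subspaces):
        # Enlarged, weight-biased random subspace ...
        idx = sample_subspace(w, enlarged_size, rng)
        # ... reduced again with PCA to keep each base learner cheap.
        mean, comps = pca_fit(X[:, idx], reduced_dim)
        subspaces.append((idx, mean, comps))
    return subspaces


def project(X, subspace):
    """Map samples into the PCA-reduced subspace."""
    idx, mean, comps = subspace
    return (X[:, idx] - mean) @ comps.T
```

Each projected matrix returned by `project` would then be passed to a base classifier (the keywords suggest an SVM), and the base classifiers' predictions combined, for example by voting. That training and combination step is omitted from this sketch.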
摘要: 针对论坛(BBS)中文本的情感分类问题,提出一种改进的随机子空间算法。挖掘特征空间中的分类信息,在生成子空间的过程中,利用权重函数对特征进行分类能力评估,以较大概率选择分类能力较好的特征维度,保证分类精度;扩大选择的子空间维度,选择具有分类能力的特征,通过主元分析对子空间进行降维,保证算法效率和子空间多样性。实验结果表明,该算法分类精度达到91.3%,比基准算法具有更好的性能稳定性。
关键词:
情感分析,
集成学习,
随机子空间方法,
主元分析,
支持向量机,
基分类器
LIU Lin, LIU San-ya, LIU Zhi, TIE Lu. Study on BBS Sentiment Classification Based on Random Principal Component Analysis Algorithm[J]. Computer Engineering.
刘林,刘三女牙,刘智,铁璐. 基于随机主元分析算法的BBS情感分类研究[J]. 计算机工程.