Computer Engineering ›› 2018, Vol. 44 ›› Issue (11): 209-214, 221. doi: 10.19678/j.issn.1000-3428.0048955

• Artificial Intelligence and Recognition Technology •

Text Sentiment Analysis Combined with Part of Speech Features and Convolutional Neural Network

HE Hongye, ZHENG Jin, ZHANG Zuping

  1. School of Information Science and Engineering, Central South University, Changsha 410083, China
  • Received: 2017-10-16 Online: 2018-11-15 Published: 2018-11-15
  • About the authors: HE Hongye (b. 1993), male, master's student, main research interests: natural language processing and deep learning; ZHENG Jin, associate professor; ZHANG Zuping, professor
  • Supported by:

    National Natural Science Foundation of China (61379109)

Abstract:

In a Convolutional Neural Network (CNN) model, an inaccurate input text representation introduces noise that makes training prone to overfitting. To improve the quality of the input text representation in text CNNs, this paper builds a text CNN model that incorporates Part-of-Speech (POS) features. POS features are used to capture polysemy, which conventional word embeddings cannot distinguish, and are combined with the original text representation to form the dual-channel input of a CNN named Word-POS CNN (WP-CNN). Experimental results on a Chinese hotel review dataset and an English movie review dataset show that, compared with traditional text CNN models, the proposed model achieves clearly better precision, recall, and F1-score in sentiment classification.

Key words: Natural Language Processing (NLP), sentiment analysis, deep learning, Convolutional Neural Network (CNN), text representation
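
The dual-channel input described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the POS channel is encoded, so the sketch assumes one channel indexed by the word alone and a second channel indexed by the (word, POS tag) pair, so that a polysemous word such as "book" gets different vectors as a verb and as a noun. The helper names (`build_table`, `dual_channel_input`, `conv_max_pool`), the embedding size, the filter shape, and the random lookup tables are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM = 8  # hypothetical embedding size, not taken from the paper


def build_table(keys, dim, rng):
    """Random lookup table standing in for pretrained embeddings (assumption)."""
    return {k: rng.standard_normal(dim) for k in sorted(keys)}


def dual_channel_input(tagged, word_table, pair_table):
    """Stack a word channel and a word+POS channel into (2, seq_len, dim)."""
    word_ch = np.stack([word_table[w] for w, _ in tagged])
    pair_ch = np.stack([pair_table[(w, t)] for w, t in tagged])
    return np.stack([word_ch, pair_ch])


def conv_max_pool(x, filters):
    """Convolve over time, apply ReLU, then max-over-time pooling.

    x: (channels, seq_len, dim); filters: (n_filters, channels, window, dim).
    Returns one feature per filter, shape (n_filters,).
    """
    n_filters, _, window, _ = filters.shape
    steps = x.shape[1] - window + 1
    feats = np.empty((n_filters, steps))
    for i in range(steps):
        patch = x[:, i:i + window, :]  # (channels, window, dim)
        # Contract channels, window and dim: one activation per filter.
        feats[:, i] = np.tensordot(filters, patch, axes=3)
    return np.maximum(feats, 0.0).max(axis=1)


# Two toy POS-tagged sentences in which "book" is a verb and then a noun.
sent_a = [("book", "VB"), ("a", "DT"), ("room", "NN")]
sent_b = [("read", "VB"), ("a", "DT"), ("book", "NN")]

word_table = build_table({w for s in (sent_a, sent_b) for w, _ in s}, EMB_DIM, rng)
pair_table = build_table({p for s in (sent_a, sent_b) for p in s}, EMB_DIM, rng)

x_a = dual_channel_input(sent_a, word_table, pair_table)
x_b = dual_channel_input(sent_b, word_table, pair_table)

filters = rng.standard_normal((4, 2, 2, EMB_DIM))  # 4 filters, window of 2 tokens
features = conv_max_pool(x_a, filters)
```

In this sketch the first channel gives "book" the same vector in both sentences, while the second channel separates its verb and noun uses. Each filter spans both channels, so the pooled features can draw on either representation.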
