Computer Engineering (计算机工程)

• Artificial Intelligence and Recognition Technology •

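Comment Text Orientation Identification Method Based on Evaluation Modification Distribution Difference

FENG Xupeng 1a, MA Zhen 1a, XIE Bo 1a, LIU Lijun 1b, HUANG Qingsong 1b,2

  1. (1. Kunming University of Science and Technology, a. Educational Technology and Network Center; b. Faculty of Information Engineering and Automation, Kunming 650500, China; 2. Yunnan Key Laboratory of Computer Technology Applications, Kunming 650500, China)
  • Received: 2015-11-26 Online: 2016-10-15 Published: 2016-10-15
  • About the authors: FENG Xupeng (born 1986), male, laboratory technician, M.S., CCF member, research interests: machine learning and data mining; MA Zhen, senior laboratory technician; XIE Bo, engineer; LIU Lijun, lecturer; HUANG Qingsong (corresponding author), professor.
  • Funding:
    National Natural Science Foundation of China (81360230, 81560296).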

Abstract: In text orientation classification, unclear sentiment targets cause the polarity of modifier words to be misjudged and implicit opinions to be missed. To address these problems, this paper proposes an orientation identification method based on evaluation modification distribution difference. The method builds a modification-relationship bipartite graph and modification distribution vectors, computes how differently each evaluation object is modified in the positive and the negative training corpora, extracts the features whose modification distributions differ markedly, and incorporates the positive and negative difference information into the computation of feature values. Experimental results show that, compared with an orientation identification method that extracts subjective sentiment words as features and performs binary classification with a Support Vector Machine (SVM), the proposed method improves classification precision and recall by about 4.6% and 5.6% respectively, and thus effectively improves orientation identification for comment text. In the cross-domain setting, its drop in precision and recall is about 6.6% and 6.4% smaller than that of the same SVM baseline, which indicates a degree of domain adaptability.

Key words: text orientation identification, evaluation object, modification relationship, distribution difference, domain adaptation
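
The pipeline summarized in the abstract (bipartite graph, modification distribution vectors, distribution difference, feature extraction) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the difference formula, so half the L1 distance between the two normalized distributions is assumed here, and the function names, the threshold of 0.5, and the toy (object, modifier) pairs are all hypothetical.

```python
from collections import Counter, defaultdict


def build_bipartite(pairs):
    """Map each evaluation object to a Counter of the modifiers attached to it."""
    graph = defaultdict(Counter)
    for obj, modifier in pairs:
        graph[obj][modifier] += 1
    return dict(graph)


def distribution(counter, vocab):
    """Normalize modifier counts into a probability vector over vocab."""
    total = sum(counter.values())
    if total == 0:
        return [0.0] * len(vocab)
    return [counter[m] / total for m in vocab]


def distribution_difference(pos_graph, neg_graph):
    """Per-object gap between positive and negative modifier distributions.

    Assumption: half the L1 distance is used as the difference measure;
    the paper's own formula is not stated in the abstract.
    """
    diffs = {}
    for obj in set(pos_graph) | set(neg_graph):
        pos = pos_graph.get(obj, Counter())
        neg = neg_graph.get(obj, Counter())
        vocab = sorted(set(pos) | set(neg))
        p = distribution(pos, vocab)
        q = distribution(neg, vocab)
        diffs[obj] = 0.5 * sum(abs(a - b) for a, b in zip(p, q))
    return diffs


# Toy (evaluation object, modifier) pairs, invented for illustration only.
pos_pairs = [("screen", "clear"), ("screen", "bright"),
             ("battery", "long"), ("price", "high")]
neg_pairs = [("screen", "dim"), ("battery", "short"), ("price", "high")]

diffs = distribution_difference(build_bipartite(pos_pairs),
                                build_bipartite(neg_pairs))

# Keep only objects whose modification distributions differ markedly
# (hypothetical threshold).
selected = sorted(obj for obj, d in diffs.items() if d >= 0.5)
print(selected)
```

In this toy data, "price" is modified by "high" in both corpora, so its difference is zero and it is dropped, whereas "screen" and "battery" attract disjoint modifiers and are kept. How the signed difference is then folded into the feature values passed to the classifier is left out, since the abstract does not specify it.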
