
Computer Engineering


Sentiment Analysis of Chinese Micro-blog Based on Topic

WEI Hang, WANG Yongheng

  1. (School of Information Science and Engineering, Hunan University, Changsha 410082, China)
  • Received: 2014-07-30 Online: 2015-09-15 Published: 2015-09-15

基于主题的中文微博情感分析

韦航,王永恒   

  1. (湖南大学信息科学与工程学院,长沙 410082)
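  • About the authors: WEI Hang (born 1990), female, master's student, main research interests: text analysis and data mining; WANG Yongheng, lecturer, Ph.D.
  • Funding:
    National Natural Science Foundation of China (61371116); Natural Science Foundation of Hunan Province (13JJ3046).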

Abstract: Micro-blogs attract a large number of users who publish and share opinions, making them an important data source for opinion mining and sentiment analysis. Traditional methods generally ignore structured semantic information, which lowers classification accuracy. They also tend to ignore the specific target of a sentiment expression and analyze sentiment in a topic-independent way, which easily produces incorrect results. This paper proposes pruning the syntax tree to implement topic-dependent sentiment analysis. It uses the convolution tree kernel of a Support Vector Machine (SVM) to obtain structured features from the syntax tree, and prunes the tree by topic using a domain ontology and a library of syntactic paths, eliminating the interference of irrelevant appraisal expressions. Experimental results on two corpora with different topics show accuracies of 86.6% and 86.0%.
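
The two ideas named in the abstract, a convolution tree kernel and topic-dependent pruning, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The kernel is a standard subset-tree kernel in the style of Collins and Duffy, and the pruning is a simple keyword check that stands in for the paper's domain ontology and syntactic path library. The tree encoding, the clause label "IP", the decay parameter `lam`, and all function names are assumptions made for illustration.

```python
from functools import lru_cache

# Assumed encoding: a tree is a tuple (label, child, ...); a leaf word is a string.

def nodes(tree):
    """Yield every internal node (tuple) of the tree."""
    if isinstance(tree, tuple):
        yield tree
        for child in tree[1:]:
            yield from nodes(child)

def leaves(tree):
    """Yield the words at the leaves of the tree."""
    if isinstance(tree, tuple):
        for child in tree[1:]:
            yield from leaves(child)
    else:
        yield tree

def production(node):
    """Grammar production at a node: its label plus child labels or words."""
    return (node[0],) + tuple(
        c[0] if isinstance(c, tuple) else ("word", c) for c in node[1:]
    )

def tree_kernel(t1, t2, lam=0.5):
    """Subset-tree kernel: decayed count of tree fragments shared by t1 and t2."""
    @lru_cache(maxsize=None)
    def common(n1, n2):
        if production(n1) != production(n2):
            return 0.0
        score = lam
        for c1, c2 in zip(n1[1:], n2[1:]):
            if isinstance(c1, tuple):  # equal productions, so c2 is a node too
                score *= 1.0 + common(c1, c2)
        return score
    return sum(common(a, b) for a in nodes(t1) for b in nodes(t2))

def prune(tree, topic_terms, clause_label="IP"):
    """Drop clause subtrees that mention no topic term (hypothetical stand-in
    for ontology- and path-based pruning)."""
    if not isinstance(tree, tuple):
        return tree
    kept = []
    for child in tree[1:]:
        is_clause = isinstance(child, tuple) and child[0] == clause_label
        if is_clause and not topic_terms & set(leaves(child)):
            continue  # clause is about something other than the topic
        kept.append(prune(child, topic_terms, clause_label))
    return (tree[0],) + tuple(kept)

# Toy sentence with two clauses; only the first is about the topic "screen".
sentence = (
    "ROOT",
    ("IP", ("NP", ("NN", "screen")), ("VP", ("VA", "great"))),
    ("IP", ("NP", ("NN", "weather")), ("VP", ("VA", "awful"))),
)
pruned = prune(sentence, {"screen"})
similarity = tree_kernel(pruned, pruned)
```

In this sketch the pruned tree keeps only the clause about the screen, so the opinion about the weather no longer contributes shared fragments to the kernel. To train a classifier, the pairwise kernel values could be collected into a Gram matrix and passed to an SVM implementation that accepts precomputed kernels.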

Key words: Chinese micro-blog, sentiment analysis, syntax tree, tree kernel function, pruning strategy, Support Vector Machine (SVM)

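Abstract (Chinese version): Traditional micro-blog sentiment analysis generally ignores structured semantic information, which keeps classification accuracy low. It also ignores the specific target of a sentiment expression and analyzes sentiment independently of topic, which easily leads to incorrect results. To address this, the paper prunes the syntax tree to implement topic-based sentiment analysis: it uses the convolution tree kernel function of a support vector machine to obtain structured features of the syntax tree, and builds an ontology and a syntactic path library to prune the syntax tree by topic, removing the interference of irrelevant evaluations. Experimental results show that the method reaches accuracies of 86.6% and 86.0% on two datasets with different topics.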

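Key words (Chinese version): Chinese micro-blog, sentiment analysis, syntax tree, tree kernel function, pruning strategy, support vector machine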
