Computer Engineering

Unsupervised Sentiment Orientation Analysis on Micro-blog Based on Biterm Topic Model

ZHANG Jiaming, WANG Bo, TANG Haohao, LI Tiancai

  1. (School of Information and System Engineering, PLA Information Engineering University, Zhengzhou 450001, China)
  • Received: 2014-07-07 Online: 2015-07-15 Published: 2015-07-15

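  • About the authors: ZHANG Jiaming (born 1989), male, master's student, main research interest: sentiment analysis; WANG Bo, associate professor; TANG Haohao and LI Tiancai, master's students.
  • Supported by:
    National High Technology Research and Development Program of China ("863" Program) (2011AA7032030D); a fund project of the national ministries and commissions.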

Abstract: Sentiment orientation analysis of microblogs has become an active research topic. Unsupervised methods based on traditional topic models fail to handle the feature sparsity of microblog corpora, which results in poor performance in microblog sentiment orientation analysis. To solve this problem, this paper presents an unsupervised method for microblog sentiment orientation analysis based on the Biterm Topic Model (BTM). The corpus is first preprocessed and its co-occurring word pairs are counted. BTM is then used to mine the latent topics in the documents, and an existing sentiment dictionary is used to calculate the sentiment distribution of each topic. The sentiment orientation of a whole microblog is obtained from the sentiment distributions of its topics. Experiments on the NLP&CC2012 corpus show that the proposed method identifies the sentiment orientation of microblogs more effectively, with an average F1-measure 15% higher than that of methods based on traditional topic models.
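
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the function names, the toy lexicon, and the aggregation rule (the lexicon polarity averaged under each topic's word distribution, then weighted by the document's topic proportions) are assumptions made for illustration. The BTM parameters `theta` (topic proportions) and `phi` (per-topic word distributions) are assumed to have been estimated already, for example by Gibbs sampling, which is not shown. The document-topic step uses the standard BTM inference P(z|d) = Σ_b P(z|b) P(b|d).

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair co-occurring in one short text."""
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def count_biterms(docs):
    """Count biterms over a corpus of tokenized microblogs."""
    counts = Counter()
    for tokens in docs:
        counts.update(extract_biterms(tokens))
    return counts


def doc_topic_dist(tokens, theta, phi):
    """P(z|d) = sum_b P(z|b) P(b|d), with P(z|b) proportional to
    theta[z] * phi[z][w1] * phi[z][w2] and P(b|d) uniform over the
    biterm occurrences of the document."""
    biterms = extract_biterms(tokens)
    dist = [0.0] * len(theta)
    for w1, w2 in biterms:
        weights = [theta[z] * phi[z].get(w1, 0.0) * phi[z].get(w2, 0.0)
                   for z in range(len(theta))]
        total = sum(weights)
        if total == 0:
            continue  # biterm unseen under every topic: contributes nothing
        for z, weight in enumerate(weights):
            dist[z] += weight / total / len(biterms)
    return dist


def topic_sentiment(phi_z, lexicon):
    """Assumed rule: expected lexicon polarity under the topic's words."""
    return sum(p * lexicon.get(word, 0) for word, p in phi_z.items())


def doc_sentiment(tokens, theta, phi, lexicon):
    """Assumed rule: topic polarities weighted by P(z|d), then thresholded at 0."""
    dist = doc_topic_dist(tokens, theta, phi)
    score = sum(p * topic_sentiment(phi[z], lexicon) for z, p in enumerate(dist))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Toy, hand-made parameters (hypothetical, not taken from the paper).
theta = [0.5, 0.5]
phi = [{"great": 0.5, "phone": 0.5}, {"awful": 0.5, "phone": 0.5}]
lexicon = {"great": 1, "awful": -1}
print(doc_sentiment(["great", "phone"], theta, phi, lexicon))
print(doc_sentiment(["awful", "phone"], theta, phi, lexicon))
```

Because biterms are counted over the whole corpus rather than per document, the topic model sees far more co-occurrence evidence than a single short microblog provides, which is the property the abstract relies on to address feature sparsity.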

Keywords: microblog, short text, sentiment orientation analysis, unsupervised, Biterm Topic Model (BTM)

