
Computer Engineering

• Artificial Intelligence and Recognition Technology •

Multiple-classifiers Opinion Sentence Recognition in Chinese Micro-blog Based on D-S Theory

GUO Yun-long, PAN Yu-bin, ZHANG Ze-yu, LI Li

  1. (School of Computer and Information Science, Southwest University, Chongqing 400715, China)
  • Received: 2013-05-20 Online: 2014-04-15 Published: 2014-04-14
  • About the authors: GUO Yun-long (born 1990), male, master's student, main research interests: natural language processing and semantic networks; PAN Yu-bin, undergraduate student; ZHANG Ze-yu (corresponding author), master's student; LI Li, professor.
  • Funding:
    National Natural Science Foundation of China (61170192).

Abstract: With the development and spread of new technologies and social networks, the volume of micro-blog user data has surged, and related research has drawn attention from both academia and industry. Targeting the characteristics of Chinese micro-blog sentences, this paper compares several feature selection methods and proposes a new statistical method for features. Using the constructed word dictionary and part-of-speech dictionary, it analyzes classification models including Support Vector Machine (SVM), Naive Bayes, and K-Nearest Neighbour (KNN), and combines the multiple classifiers through D-S evidence theory to recognize opinion sentences in Chinese micro-blogs. On the data provided by the CCF Conference on Natural Language Processing and Chinese Computing (NLP&CC 2012), the method achieves 70.6% precision, 89.2% recall, and an F-measure of 78.9%, whereas the averages of the evaluation results published by NLP&CC 2012 are 72.7%, 61.5%, and 64.7%, respectively. The method exceeds these averages in recall and F-measure, and its F-measure is 0.5% higher than the best NLP&CC 2012 result.

Key words: micro-blog, opinion sentence, Support Vector Machine (SVM), Naive Bayes, K-Nearest Neighbour (KNN), D-S theory

CLC number:
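
The abstract says the outputs of several classifiers are fused with D-S evidence theory, but it does not say how each classifier's output becomes a mass function. The following is a minimal sketch of Dempster's rule of combination on a two-class frame, assuming each classifier emits a probability that a sentence is an opinion sentence and that this probability is discounted by a per-classifier reliability weight. The names `mass_from_probability`, `dempster_combine`, `classify`, and the `reliability` parameter are hypothetical and are not taken from the paper.

```python
from functools import reduce
from itertools import product

OPINION = frozenset({"opinion"})
NON_OPINION = frozenset({"non_opinion"})
THETA = OPINION | NON_OPINION  # full frame of discernment


def mass_from_probability(p_opinion, reliability):
    """Turn a classifier's P(opinion) into a mass function.

    Assumption (not from the paper): the probability is discounted by
    `reliability`, and the remaining mass is left uncommitted on THETA.
    """
    return {
        OPINION: reliability * p_opinion,
        NON_OPINION: reliability * (1.0 - p_opinion),
        THETA: 1.0 - reliability,
    }


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence accumulates on the intersection.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Disjoint focal sets contribute to the conflict mass K.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    # Normalize by 1 - K so the combined masses sum to 1.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


def classify(probabilities, reliabilities):
    """Fuse all classifiers and label the sentence by the larger singleton mass."""
    masses = [
        mass_from_probability(p, r)
        for p, r in zip(probabilities, reliabilities)
    ]
    fused = reduce(dempster_combine, masses)
    is_opinion = fused.get(OPINION, 0.0) > fused.get(NON_OPINION, 0.0)
    return fused, is_opinion
```

For example, `classify([0.9, 0.6, 0.3], [0.8, 0.8, 0.8])` fuses three hypothetical classifier outputs (standing in for SVM, Naive Bayes, and KNN) into one mass function and a boolean label. In practice the reliability weights would have to be estimated, for instance from each classifier's accuracy on held-out data, which is a design choice the abstract does not describe.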