Computer Engineering (计算机工程)

• Artificial Intelligence and Recognition Technology

Chinese Microblog Emotion Classification Based on Class Sequential Rules

ZHENG Cheng 1,2, SHEN Lei 1,2, DAI Ning 1,2

  1. School of Computer Science and Technology, Anhui University, Hefei 230601, China
  2. Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Hefei 230601, China
  • Received: 2015-01-27 Online: 2016-02-15 Published: 2016-01-29
  • About the authors: ZHENG Cheng (b. 1964), male, associate professor, Ph.D.; main research interests are intelligent semantic information retrieval and data mining. SHEN Lei and DAI Ning are master's students.
  • Funding:
    Key Project of the Natural Science Foundation of Anhui Higher Education Institutions (KJ2013A020); Natural Science Foundation of Anhui Province (11040606M133).

Abstract: This paper studies emotion classification of Chinese microblog texts and introduces a novel approach based on class sequential rules. An emotion lexicon and a machine learning method each assign one potential emotion label to every sentence in a microblog text, giving two labels per sentence, and each microblog text is then treated as a data sequence. Class sequential rules are mined from the dataset, and the effective features derived from the mined rules are combined with other text features to train a classifier. Experimental results on the microblog dataset provided by the COAE conference show that the approach is effective compared with traditional methods.

Key words: emotion classification, microblog text, class sequential rule, emotion lexicon, machine learning, text feature
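The abstract does not define class sequential rules in detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each microblog is a sequence of per-sentence label sets (the lexicon label plus the classifier label), and that a rule "pattern → class" is scored by the usual support and confidence. The emotion names, the toy data, and the helper names `contains`, `rule_stats`, and `rule_features` are all hypothetical. The sketch scores candidate rules that are given to it and does not enumerate patterns the way a full mining algorithm would.

```python
def contains(pattern, sequence):
    """Return True if `pattern` is a subsequence of `sequence`.

    Both are lists of sets; each pattern element must be a subset of a
    distinct sequence element, with order preserved (greedy left-to-right).
    """
    pos = 0
    for element in pattern:
        while pos < len(sequence) and not element <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next pattern element must match a later sentence
    return True


def rule_stats(pattern, label, data):
    """Support and confidence of the rule `pattern -> label` over `data`."""
    covered = [y for seq, y in data if contains(pattern, seq)]
    hits = sum(1 for y in covered if y == label)
    support = hits / len(data)
    confidence = hits / len(covered) if covered else 0.0
    return support, confidence


def rule_features(sequence, rules):
    """One binary feature per rule: does the rule's pattern occur in the text?"""
    return [1 if contains(pattern, sequence) else 0 for pattern, _ in rules]


# Hypothetical toy data: (per-sentence label sets, microblog-level emotion).
data = [
    ([{"happy"}, {"happy", "like"}], "happy"),
    ([{"sad"}, {"happy"}, {"sad", "anger"}], "sad"),
    ([{"happy"}, {"like"}], "happy"),
    ([{"sad"}, {"sad"}], "sad"),
]

if __name__ == "__main__":
    print(rule_stats([{"happy"}, {"like"}], "happy", data))
```

In this reading, rules whose support and confidence pass chosen thresholds would be kept, and the vector returned by `rule_features` would be concatenated with the other text features before training the classifier.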

CLC number: