Computer Engineering

• Artificial Intelligence and Recognition Technology •

Opinion Analysis of Cross-domain Product Review Based on Feature Transformation

MENG Jia-na 1,2, DUAN Xiao-dong 1, YANG Liang 2

  1. School of Computer Science and Engineering, Dalian Nationalities University, Dalian 116600, Liaoning, China
  2. School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China
  • Received: 2012-09-18 Online: 2013-10-15 Published: 2013-10-14
  • About the authors: MENG Jia-na (born 1972), female, professor, Ph.D.; main research interests: natural language processing and machine learning. DUAN Xiao-dong, professor, Ph.D. YANG Liang, Ph.D. candidate.
  • Supported by:
    National Natural Science Foundation of China (61202254); China Postdoctoral Science Foundation (2013M530918); Independent Research Fund for the Central Universities (DC120101081, DC120101084); General Project of the Scientific Research Fund of the Education Department of Liaoning Province (L2012478)

Abstract: Traditional sentiment analysis methods mainly target documents from a single domain and perform poorly on documents from a different domain. To address this problem, this paper proposes an opinion analysis method for cross-domain product reviews based on feature transformation. The method uses domain-independent words to build associations between the domain-dependent words of the source domain and those of the target domain, so that domain knowledge can be transferred from the source domain to the target domain. This mitigates the drop in classifier performance caused by differing data distributions. Product review texts are used as the corpus in the experiments. Averaged over all corpora, the accuracy is 76.61% with Support Vector Machine (SVM) and 76.81% with logistic regression, both higher than the average result of the Baseline algorithm.

Key words: feature transformation, opinion analysis, product review, source domain, target domain, domain-independent word, domain-dependent word
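
The abstract describes the method only at a high level, so the following Python sketch illustrates the general idea of bridging domain-dependent words through domain-independent words. It is not the authors' algorithm. The function names, the document-frequency rule for choosing domain-independent words, the row normalization, and the `bridge:` feature prefix are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def split_vocabulary(source_docs, target_docs, min_df=1):
    """Split words into domain-independent and domain-dependent sets.

    Assumption: a word counts as domain-independent if it occurs in at
    least min_df documents of BOTH domains; all other words are treated
    as domain-dependent.
    """
    src_df = Counter(w for doc in source_docs for w in set(doc))
    tgt_df = Counter(w for doc in target_docs for w in set(doc))
    vocab = set(src_df) | set(tgt_df)
    independent = {w for w in vocab
                   if src_df[w] >= min_df and tgt_df[w] >= min_df}
    return independent, vocab - independent


def cooccurrence(docs, independent, dependent):
    """For each domain-dependent word, count the documents it shares with
    each domain-independent word, then normalize each row to sum to 1."""
    counts = defaultdict(Counter)
    for doc in docs:
        words = set(doc)
        for d in words & dependent:
            for p in words & independent:
                counts[d][p] += 1
    weights = {}
    for d, row in counts.items():
        total = sum(row.values())
        weights[d] = {p: c / total for p, c in row.items()}
    return weights


def transform(doc, weights):
    """Return bag-of-words counts plus 'bridge:' features that project
    domain-dependent words onto the shared domain-independent words."""
    features = Counter(doc)
    for word in doc:
        for p, w in weights.get(word, {}).items():
            features["bridge:" + p] += w
    return dict(features)


# Toy corpora (hypothetical): book reviews as source, camera reviews as target.
source_docs = [["boring", "plot", "bad"], ["gripping", "story", "good"]]
target_docs = [["blurry", "lens", "bad"], ["sharp", "photo", "good"]]

independent, dependent = split_vocabulary(source_docs, target_docs)
weights = cooccurrence(source_docs + target_docs, independent, dependent)

source_features = transform(["boring", "plot"], weights)
target_features = transform(["blurry", "lens"], weights)
```

In this toy setup, a source review containing "boring" and a target review containing "blurry" share no surface words, yet both receive a nonzero `bridge:bad` feature. A classifier such as SVM or logistic regression trained on the transformed source features can then make use of that shared feature on target reviews.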

CLC Number: