
Computer Engineering

• Artificial Intelligence and Recognition Technology •

Commodity Opinion Target Extraction Based on Part of Speech Feature and Syntactic Analysis

QIU Yunfei a, CHEN Yifang a, WANG Wei a, SHAO Liangshan b

  1. (a. School of Software; b. System Engineering Institute, Liaoning Technical University, Huludao, Liaoning 125105, China)
  • Received: 2015-07-17 Online: 2016-07-15 Published: 2016-07-15
  • About the authors: QIU Yunfei (born 1976), male, professor, Ph.D., CCF member, with research interests in data mining and sentiment analysis; CHEN Yifang, master's student; WANG Wei, lecturer, M.S.; SHAO Liangshan, professor, Ph.D.
  • Funding:
    National Natural Science Foundation of China, Young Scientists Fund project "Research on a Prior-Knowledge Coupled Fusion Method for the Bidirectional Reflectance Distribution Function" (61401185); Liaoning Province Higher Education Outstanding Young Scholars Growth Plan (LJQ2012027); General Fund projects of the Education Department of Liaoning Province (L2013131, L2013133).

Abstract: Non-standard and diverse language in Chinese online reviews causes opinion targets to be misidentified. To address this problem, a method for extracting commodity opinion targets based on part-of-speech features and syntactic analysis is proposed. Following the characteristics of the Chinese language, rules built on the part-of-speech features of adjectives, adverbs, and verbs are used to extract evaluation words. Candidate opinion targets are then extracted from the syntactic tree structure of the clause sequence and filtered. Finally, evaluation collocations are screened using core syntactic paths, which reduces the opinion-target and evaluation-word noise introduced during extraction and leaves the genuine opinion targets. Experimental results show that introducing the syntactic tree structure and the core syntactic paths raises the F value of commodity opinion target recognition above 80%.

Keywords: Chinese evaluation word, opinion target, syntactic tree structure, part-of-speech feature, syntactic path
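
The abstract does not list the actual extraction rules, so the following is only a minimal sketch of the general idea, under stated assumptions. It takes tokens that are already part-of-speech tagged, treats each adjective (merged with a directly preceding adverb) as an evaluation word, and pairs it with the nearest preceding noun. The tag names (`n`, `a`, `d`, `w`), the function names, and the `max_distance` window are all hypothetical, and the distance window is a simple stand-in for the syntactic-tree and core-syntactic-path filtering described above, not a reproduction of it. The last function computes precision, recall, and the F value used to report results.

```python
from typing import Iterable, List, Tuple

Token = Tuple[str, str]  # (word, POS tag); the tag set below is illustrative only


def extract_evaluation_words(tokens: List[Token]) -> List[Tuple[int, str]]:
    """Return (index, phrase) for every adjective, merging a directly preceding adverb."""
    found = []
    for i, (word, tag) in enumerate(tokens):
        if tag != "a":  # assumption: "a" marks adjectives
            continue
        if i > 0 and tokens[i - 1][1] == "d":  # assumption: "d" marks adverbs
            found.append((i, tokens[i - 1][0] + word))
        else:
            found.append((i, word))
    return found


def pair_with_targets(tokens: List[Token],
                      evaluations: List[Tuple[int, str]],
                      max_distance: int = 3) -> List[Tuple[str, str]]:
    """Pair each evaluation word with the nearest preceding noun within a window.

    The window is a crude substitute for a syntactic-path filter.
    """
    pairs = []
    for idx, phrase in evaluations:
        lower = max(idx - 1 - max_distance, -1)
        for j in range(idx - 1, lower, -1):
            if tokens[j][1] == "n":  # assumption: "n" marks nouns
                pairs.append((tokens[j][0], phrase))
                break
    return pairs


def precision_recall_f(predicted: Iterable[str],
                       gold: Iterable[str]) -> Tuple[float, float, float]:
    """Precision, recall, and F value (their harmonic mean) over sets of targets."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    p = true_positives / len(predicted) if predicted else 0.0
    r = true_positives / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


# Toy review: "屏幕很清晰,电池不耐用" (the screen is very clear, the battery does not last)
tokens = [("屏幕", "n"), ("很", "d"), ("清晰", "a"), (",", "w"),
          ("电池", "n"), ("不", "d"), ("耐用", "a")]
evaluations = extract_evaluation_words(tokens)
pairs = pair_with_targets(tokens, evaluations)
targets = [target for target, _ in pairs]
# Hypothetical gold annotation that also contains a target the toy rules miss
p, r, f = precision_recall_f(targets, ["屏幕", "电池", "价格"])
```

On this toy input the sketch pairs 屏幕 (screen) with 很清晰 and 电池 (battery) with 不耐用. A fixed window of this kind breaks down on long or irregular clauses, which is the sort of noise the syntactic analysis in the abstract is meant to reduce.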
