Computer Engineering

• Artificial Intelligence and Recognition Technology •

Extraction of Opinion Targets Based on Frequent Sub-tree Pattern

TIAN Weidong, MIAO Huijun

  1. (School of Computer and Information, Hefei University of Technology, Hefei 230009, China)
  • Received: 2016-03-21 Publication date: 2017-04-15 Release date: 2017-04-14
  • About the authors: TIAN Weidong (born 1970), male, associate professor, main research interests in intelligent computing and data mining; MIAO Huijun, master's student.
  • Funding:
    National "863" Program of China (2012AA011005); National Natural Science Foundation of China (61273292); Open Project of the Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine (ACAIM2015xxx).

Abstract: Most existing opinion target extraction methods rely on heuristic rules, or on machine learning over features such as part of speech and word form, and make little use of the deep syntactic associations revealed by dependency analysis. To address this, an opinion target extraction method for short Chinese review texts is proposed, based on frequent tree patterns mined from a dependency tree bank. The method first produces an initial annotation of each short text using frequent dependency sub-tree patterns. It then uses an error-driven TBL framework to refine an ordered rule set of frequent sub-tree patterns that reflect the characteristics of opinion targets. Finally, it extracts opinion targets with that ordered rule set. Experimental results show that the method is stable and accurate, and that it outperforms a Support Vector Machine (SVM)-based method on metrics such as recall and F1-score.

Key words: dependency syntax, short text, frequent sub-tree pattern, error-driven, Support Vector Machine (SVM)
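
The three-stage pipeline summarized in the abstract (initial tagging, error-driven learning of an ordered rule set, extraction) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that TBL refers to Brill-style transformation-based learning, and it replaces the mined frequent sub-tree patterns with a much simpler stand-in: the triple (part of speech, dependency relation, part of speech of the head). The toy tokens, their tags, the labels "T" (target) and "O" (other), the noun-based initial tagger, and all function names are hypothetical.

```python
from typing import NamedTuple


class Token(NamedTuple):
    word: str
    pos: str       # part-of-speech tag (hypothetical tag set)
    dep: str       # dependency relation to the head (hypothetical labels)
    head_pos: str  # part-of-speech tag of the head word


class Rule(NamedTuple):
    context: tuple  # simplified stand-in for a frequent sub-tree pattern
    old: str        # label the rule rewrites
    new: str        # label the rule assigns


def context(tok):
    return (tok.pos, tok.dep, tok.head_pos)


def apply_rule(rule, tokens, labels):
    # Relabel every token whose context and current label match the rule.
    return [rule.new if lab == rule.old and context(t) == rule.context else lab
            for t, lab in zip(tokens, labels)]


def errors(labels, gold):
    return sum(l != g for l, g in zip(labels, gold))


def learn_rules(tokens, initial, gold, max_rules=10):
    # Greedy error-driven loop: propose rules only at mislabeled tokens,
    # keep the one that removes the most errors, apply it, and repeat.
    labels, rules = list(initial), []
    for _ in range(max_rules):
        candidates = {Rule(context(t), l, g)
                      for t, l, g in zip(tokens, labels, gold) if l != g}
        best, best_gain = None, 0
        for rule in sorted(candidates):  # sorted for deterministic ties
            gain = errors(labels, gold) - errors(apply_rule(rule, tokens, labels), gold)
            if gain > best_gain:
                best, best_gain = rule, gain
        if best is None:
            break
        rules.append(best)
        labels = apply_rule(best, tokens, labels)
    return rules


def extract_targets(tokens, initial, rules):
    # Apply the learned rules in order, then collect tokens labeled "T".
    labels = list(initial)
    for rule in rules:
        labels = apply_rule(rule, tokens, labels)
    return [t.word for t, lab in zip(tokens, labels) if lab == "T"]


# Toy training data (hand-made, not from the paper).
tokens = [
    Token("screen", "NN", "SBV", "JJ"), Token("clear", "JJ", "HED", "ROOT"),
    Token("battery", "NN", "SBV", "JJ"), Token("weak", "JJ", "HED", "ROOT"),
    Token("yesterday", "NN", "ADV", "VV"), Token("arrived", "VV", "HED", "ROOT"),
    Token("today", "NN", "ADV", "VV"), Token("charging", "VV", "SBV", "JJ"),
    Token("fast", "JJ", "HED", "ROOT"),
]
gold = ["T", "O", "T", "O", "O", "O", "O", "T", "O"]

# Hypothetical initial tagger: every noun is an opinion target.
initial = ["T" if t.pos == "NN" else "O" for t in tokens]

rules = learn_rules(tokens, initial, gold)
print(extract_targets(tokens, initial, rules))
```

On this toy data the learner keeps two rules, ordered by how many errors each one removes: the first demotes adverbial nouns such as "yesterday" from "T" to "O", and the second promotes the verbal subject "charging" to "T". The order matters in general because a later rule is scored against labels already corrected by the earlier ones.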

CLC Number: