Computer Engineering, 2012, Vol. 38, Issue (12): 179-181. doi: 10.3969/j.issn.1000-3428.2012.12.053

• Artificial Intelligence and Recognition Technology •

Recognition of Chinese Personal Name Based on Error-driven Learning and HowNet

LI Bo, ZHANG Lei

  1. (College of Information Science & Technology, Northwest University, Xi’an 710127, China)
  • Received: 2011-11-14 Online: 2012-06-20 Published: 2012-06-20
  • About the authors: LI Bo (born 1980), male, master's student; main research interests: artificial intelligence and natural language understanding. ZHANG Lei, professor, Ph.D.
  • Supported by:
    Natural Science Basic Research Plan of Shaanxi Province (2010JM8031)

Abstract: Considering the respective strengths and weaknesses of statistical and rule-based methods, this paper proposes an automatic method for Chinese personal name recognition that combines Transformation-based Error-driven Learning (TBL) with HowNet. Using an annotated corpus, the contexts of personal names are tagged with roles according to their function in name recognition, and the role-tagged instances are extracted. TBL and HowNet are then applied to the extracted instances to derive usable rules, and the rules and instances are combined to recognize personal names in text. Experimental results show that, compared with other methods, the proposed method clearly improves the precision, recall, and F-value of Chinese personal name recognition.

Key words: Chinese personal name recognition, Transformation-based Error-driven Learning (TBL), HowNet, corpus, role tagging
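
The transformation-based error-driven learning step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the single rule template ("change tag A to B when the previous token is w"), the tag names `O`/`PER`, the function names, and the toy sentence with its personal names are all assumptions made for illustration, and the sketch does not use HowNet. It only shows the general TBL loop: start from a baseline tagging, propose rules from the current errors, keep the rule that removes the most errors against the annotated corpus, and repeat.

```python
def errors(tags, gold):
    """Count positions where the current tags disagree with the gold tags."""
    return sum(t != g for t, g in zip(tags, gold))


def apply_rule(rule, tokens, tags):
    """Apply one (from_tag, to_tag, prev_word) rule and return the new tags."""
    from_tag, to_tag, prev_word = rule
    new_tags = list(tags)
    for i in range(1, len(tokens)):
        if tags[i] == from_tag and tokens[i - 1] == prev_word:
            new_tags[i] = to_tag
    return new_tags


def learn_rules(tokens, gold, initial, max_rules=10):
    """Greedy error-driven learning over a single hypothetical rule template."""
    tags = list(initial)
    rules = []
    for _ in range(max_rules):
        # Propose candidate rules only from positions that are currently wrong.
        candidates = {
            (tags[i], gold[i], tokens[i - 1])
            for i in range(1, len(tokens))
            if tags[i] != gold[i]
        }
        best_rule, best_gain = None, 0
        for rule in sorted(candidates):
            gain = errors(tags, gold) - errors(apply_rule(rule, tokens, tags), gold)
            if gain > best_gain:
                best_rule, best_gain = rule, gain
        if best_rule is None:  # no rule reduces the error count any further
            break
        tags = apply_rule(best_rule, tokens, tags)
        rules.append(best_rule)
    return rules, tags


# Invented toy corpus: a title word ("记者" reporter, "教授" professor)
# precedes each personal name. Names and sentence are illustrative only.
tokens = ["记者", "王明", "报道", "教授", "李华", "说", "记者", "张伟", "说"]
gold = ["O", "PER", "O", "O", "PER", "O", "O", "PER", "O"]
initial = ["O"] * len(tokens)  # baseline: nothing is tagged as a name

rules, tags = learn_rules(tokens, gold, initial)
print(rules)
print(tags == gold)
```

On this toy input the rule triggered by "记者" is learned first because it fixes two errors, followed by the rule triggered by "教授". In a fuller system the trigger word could plausibly be replaced by a context role or a semantic class, which is where role tagging and a resource such as HowNet would come in; that extension is an assumption and is not shown here.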
