Abstract:
After comparing the strengths and weaknesses of statistical and rule-based methods, this paper presents an automatic method for Chinese personal name recognition that combines Transformation-based Error-driven Learning (TBL) with HowNet. Using an annotated corpus, the contexts of names are tagged with roles according to their function in name recognition, and the role-tagged instances are extracted. TBL and HowNet are then applied to these instances to extract usable rules. The rules and instances are combined to recognize personal names in text. Experimental results show that the combined method is more effective for Chinese personal name recognition, achieving high precision, recall, and F value.
Key words:
Chinese personal name recognition,
Transformation-based Error-driven Learning (TBL),
HowNet,
corpus,
role tagging
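The rule-extraction step named in the abstract can be illustrated with a minimal sketch of the generic TBL loop: start from an initial tagging, propose transformation rules from the remaining errors, and greedily keep the rule with the largest net error reduction. This is not the authors' implementation. The single rule template (previous word triggers a tag change), the tag names `O` and `S`, the function names, and the toy sentence are all assumptions made for illustration, and the HowNet component is not modeled.

```python
def apply_rule(tokens, tags, rule):
    """Apply one rule (prev_word, from_tag, to_tag) to a tag sequence."""
    prev_word, from_tag, to_tag = rule
    new_tags = list(tags)
    for i in range(1, len(tokens)):
        if tokens[i - 1] == prev_word and tags[i] == from_tag:
            new_tags[i] = to_tag
    return new_tags


def learn_tbl_rules(tokens, gold, initial, max_rules=10):
    """Greedy TBL: repeatedly keep the rule with the best net error reduction."""
    current = list(initial)
    rules = []
    for _ in range(max_rules):
        # Every remaining error proposes one candidate rule (hypothetical template).
        candidates = {
            (tokens[i - 1], current[i], gold[i])
            for i in range(1, len(tokens))
            if current[i] != gold[i]
        }
        best_rule, best_gain = None, 0
        for rule in sorted(candidates):  # sorted for deterministic tie-breaking
            prev_word, from_tag, to_tag = rule
            gain = 0
            for i in range(1, len(tokens)):
                if tokens[i - 1] == prev_word and current[i] == from_tag:
                    if gold[i] == to_tag:
                        gain += 1  # rule fixes an error
                    elif gold[i] == from_tag:
                        gain -= 1  # rule breaks a correct tag
            if gain > best_gain:
                best_rule, best_gain = rule, gain
        if best_rule is None:  # no rule reduces the error count any further
            break
        current = apply_rule(tokens, current, best_rule)
        rules.append(best_rule)
    return rules


# Toy data (assumed): "S" marks a surname, "O" marks everything else.
tokens = ["记者", "李", "波", "报道", "记者", "张", "蕾", "说"]
gold = ["O", "S", "O", "O", "O", "S", "O", "O"]
initial = ["O"] * len(tokens)
print(learn_tbl_rules(tokens, gold, initial))
```

The learned rules are returned in the order they were selected, which matters in TBL because later rules are scored against the output of earlier ones.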
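Abstract (translated from the Chinese):
Considering the respective strengths and weaknesses of statistical and rule-based methods, this paper proposes an automatic Chinese personal name recognition method that combines transformation-based error-driven learning with HowNet. Using an annotated corpus, the contexts of personal names are role-tagged according to their function in name recognition, and the tagged instances are extracted. The transformation-based error-driven method and HowNet are then used to extract usable rules from these instances, and the rules and instances are combined to recognize personal names in text. Experimental results show that, compared with other methods, this method clearly improves the precision, recall, and F value of Chinese personal name recognition.
Key words:
Chinese personal name recognition,
transformation-based error-driven learning,
HowNet,
corpus,
role tagging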
LI Bo, ZHANG Lei. Recognition of Chinese Personal Name Based on Error-driven Learning and HowNet[J]. Computer Engineering, 2012, 38(12): 179-181.
李波, 张蕾. 基于错误驱动学习和知网的中文人名识别[J]. 计算机工程, 2012, 38(12): 179-181.