Abstract: To address the difficulty of recognizing complex Chinese organization names, this paper proposes a recognition method that combines Cascaded Conditional Random Fields (CCRF) with rules. Building on CCRF, the method designs feature templates through feature fusion and integrates a rule base of effective rules to support decisions on complex organization names, compensating for the weaknesses of the character-based model. In an open test on the People's Daily corpus (January 1998), organization name recognition achieves a precision of 89.92%, a recall of 91.41%, and an F1 value of 90.66%, which supports the validity of the approach.
Key words:
organization name,
Conditional Random Fields (CRFs),
rule base,
corpus,
recognition
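The three reported figures can be checked against one another, assuming the F1 value is the standard harmonic mean of precision and recall (the abstract does not state the formula, so this is an assumption). The helper name `f1_score` below is illustrative and not taken from the paper; the sketch only verifies the arithmetic and does not reproduce the experiment.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, in percent.
precision = 89.92
recall = 91.41

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2f}%")
```

Under that assumption, the computed value rounds to the 90.66% given in the abstract, so the three figures are mutually consistent.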
YANG Xiao-Dong, YAN Li, YOU Hui-Li. Chinese Organization Names Recognition Combined with CCRF and Rules[J]. Computer Engineering, 2011, 37(8): 169-171.