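Abstract: A service name recognition method based on Conditional Random Fields (CRFs) and domain rules is proposed. A feature set is selected by experimenting with different combinations of words and their parts of speech, and a CRFs model is trained on these features. The model is applied to test data to extract service terms. Similarity is computed with two measures, 2-gram and edit distance, and service names are obtained by combining domain rules with the similarity computation. Experimental results demonstrate the effectiveness of the method.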
Keywords:
service name recognition,
Conditional Random Fields,
text similarity,
edit distance
Abstract: This paper presents a method for service name recognition based on Conditional Random Fields (CRFs) and domain rules. A feature set is selected by experimenting with different combinations of words and their parts of speech. A CRFs model is trained on these features and applied to the test corpus to extract service terms, which support service name recognition. For similarity measurement, 2-gram and edit distance methods are adopted, and service names are obtained by combining domain rules with the similarity computation. Experimental results demonstrate the validity of the method.
Key words:
service name recognition,
Conditional Random Fields (CRFs),
text similarity,
edit distance
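The abstract names two similarity measures, 2-gram and edit distance, without giving their formulas. The following is a minimal sketch of how such measures are commonly computed, not the authors' implementation. The Dice coefficient over character 2-grams, the normalization of edit distance by the longer string's length, and all function names are assumptions made for illustration.

```python
from collections import Counter


def bigrams(s):
    """Return the overlapping character 2-grams of s."""
    return [s[i:i + 2] for i in range(len(s) - 1)]


def bigram_similarity(a, b):
    """Dice coefficient over 2-gram multisets (assumed formula, not from the paper)."""
    ga, gb = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(ga.values()) + sum(gb.values())
    if total == 0:
        # Both strings are shorter than 2 characters: fall back to equality.
        return 1.0 if a == b else 0.0
    overlap = sum((ga & gb).values())  # multiset intersection size
    return 2.0 * overlap / total


def edit_distance(a, b):
    """Levenshtein distance using a rolling row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def edit_similarity(a, b):
    """Map edit distance to [0, 1] by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest
```

Both functions work on any Python string, so they apply to Chinese candidate terms character by character as well as to English text. How the two scores are thresholded or combined with the domain rules is not stated in the abstract and is left open here.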
CLC number:
赵延平, 曹存根, 谢丽聪. 基于CRFs和领域规则的业务名称识别[J]. 计算机工程, 2011, 37(11): 200-202.
ZHAO Yan-Ping, CAO Cun-Gen, XIE Li-Cong. Service Name Recognition Based on CRFs and Domain Rules[J]. Computer Engineering, 2011, 37(11): 200-202.