Abstract: This paper proposes the concept of relative word frequency (RWF) and builds a context calculation model on it. The model uses the contextual information before and after an ambiguous string to resolve covering ambiguity in Chinese word segmentation. Experiments on five frequently occurring covering-ambiguity strings give an average accuracy above 95%, indicating that the method is effective for resolving this type of ambiguity.
Key words:
Chinese word segmentation,
Covering ambiguity,
Relative word frequency,
Context calculation model
QU Weiguang; JI Genlin; SUI Zhifang; ZHOU Junsheng. Context-based Approach to Covering Ambiguity Resolution in Chinese Word Segmentation[J]. Computer Engineering, 2006, 32(17): 74-76.