
Computer Engineering (计算机工程), 2006, Vol. 32, Issue 17: 74-76. doi: 10.3969/j.issn.1000-3428.2006.17.026

• Special Topic Papers •

Context-based Approach to Covering Ambiguity Resolution in Chinese Word Segmentation
(基于语境信息的组合型分词歧义消解方法)

QU Weiguang (曲维光)1,2; JI Genlin (吉根林)2; SUI Zhifang (穗志方)1; ZHOU Junsheng (周俊生)2

  1. Institute of Computational Linguistics, Peking University, Beijing 100871
  2. Department of Computer Science, Nanjing Normal University, Nanjing 210097
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2006-09-05 Published: 2006-09-05

Abstract: This paper proposes the concept of relative word frequency (RWF) and builds a context calculation model on it. The model uses the contextual information before and after an ambiguous string to resolve covering ambiguity in Chinese word segmentation. In experiments on five frequently occurring covering ambiguities, the average accuracy exceeds 95%, indicating that the method is effective for resolving covering ambiguity.

Key words: Chinese word segmentation, covering ambiguity, relative word frequency, context calculation model

CLC number:
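
The abstract does not define relative word frequency or the context calculation model, so the following Python sketch is only an illustration of the general idea of resolving a covering ambiguity from surrounding words, not the authors' method. It assumes, hypothetically, that a context word's "relative frequency" for a reading is the add-one-smoothed share of its occurrences seen next to that reading, and that a reading's score is the sum of the log shares over a small context window. The example string 才能 (one word, "talent", versus 才 + 能, "only then can"), the function names, the window size, and the toy training data are all assumptions made for illustration.

```python
import math
from collections import Counter

LABELS = ("joined", "split")  # "joined": one word; "split": two words


def train(samples, window=2):
    """Count context words per reading.

    samples: list of (left_words, right_words, label) tuples, where the
    word lists surround one occurrence of the ambiguous string.
    """
    counts = {label: Counter() for label in LABELS}
    for left, right, label in samples:
        counts[label].update(left[-window:] + right[:window])
    return counts


def relative_frequency(word, counts, label, smoothing=1.0):
    """Assumed definition: smoothed share of `word` seen with `label`."""
    per_label = {l: counts[l][word] + smoothing for l in LABELS}
    return per_label[label] / sum(per_label.values())


def resolve(left, right, counts, window=2):
    """Pick the reading whose context words have the highest log score."""
    context = left[-window:] + right[:window]
    scores = {
        label: sum(math.log(relative_frequency(w, counts, label)) for w in context)
        for label in LABELS
    }
    # On a tie, max() returns the first label in LABELS ("joined").
    return max(scores, key=scores.get)


# Toy, hand-made training contexts for the string 才能 (not from the paper).
samples = [
    (["他", "的"], ["很", "高"], "joined"),
    (["发挥", "自己", "的"], ["。"], "joined"),
    (["只有", "努力"], ["成功"], "split"),
    (["这样"], ["解决", "问题"], "split"),
]
counts = train(samples)

print(resolve(["她", "的"], ["出众"], counts))
print(resolve(["只有", "这样"], ["成功"], counts))
```

A real system would estimate the counts from a large segmented corpus and would likely weight context words by distance or position; those choices are left out here because the abstract gives no detail on them.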